Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] The invention relates generally to growth factors and specifically to a new member of the transforming growth
factor beta (TGF-β) superfamily, which is denoted growth differentiation factor-8 (GDF-8).

2. Description of Related Art 

[0002] The transforming growth factor β (TGF-β) superfamily encompasses a group of structurally-related proteins
which affect a wide range of differentiation processes during embryonic development. The family includes Mullerian
inhibiting substance (MIS), which is required for normal male sex development (Behringer, et al., Nature, 345:167,
1990), the Drosophila decapentaplegic (DPP) gene product, which is required for dorsal-ventral axis formation and
morphogenesis of the imaginal disks (Padgett, et al., Nature, 325:81-84, 1987), the Xenopus Vg-1 gene product, which
localizes to the vegetal pole of eggs (Weeks, et al., Cell, 51:861-867, 1987), the activins (Mason, et al., Biochem.
Biophys. Res. Commun., 135:957-964, 1986), which can induce the formation of mesoderm and anterior structures in
Xenopus embryos (Thomsen, et al., Cell, 63:485, 1990), and the bone morphogenetic proteins (BMPs, osteogenin,
OP-1), which can induce de novo cartilage and bone formation (Sampath, et al., J. Biol. Chem., 265:13198, 1990). The
TGF-βs can influence a variety of differentiation processes, including adipogenesis, myogenesis, chondrogenesis,
hematopoiesis, and epithelial cell differentiation (for review, see Massague, Cell, 49:437, 1987).

[0003] The proteins of the TGF-β family are initially synthesized as a large precursor protein which subsequently
undergoes proteolytic cleavage at a cluster of basic residues approximately 110-140 amino acids from the C-terminus.
The C-terminal regions, or mature regions, of the proteins are all structurally related and the different family members
can be classified into distinct subgroups based on the extent of their homology. Although the homologies within
particular subgroups range from 70% to 90% amino acid sequence identity, the homologies between subgroups are
significantly lower, generally ranging from only 20% to 50%. In each case, the active species appears to be a
disulfide-linked dimer of C-terminal fragments. Studies have shown that when the pro-region of a member of the TGF-β
family is coexpressed with a mature region of another member of the TGF-β family, intracellular dimerization and
secretion of biologically active homodimers occur (Gray, A., and Mason, A., Science, 247:1328, 1990). Additional studies
by Hammonds, et al. (Molec. Endocrin., 5:149, 1991) showed that the use of the BMP-2 pro-region combined with the
BMP-4 mature region led to dramatically improved expression of mature BMP-4. For most of the family members that
have been studied, the homodimeric species has been found to be biologically active, but for other family members,
like the inhibins, heterodimers have also been detected, and these appear to have different biological properties than
the respective homodimers.

[0004] Identification of new factors that are tissue-specific in their expression pattern will provide a greater
understanding of that tissue's development and function.

Summary of the Invention 

[0005] The present invention provides a polynucleotide encoding a growth differentiating factor-8 polypeptide
(GDF-8) as set out in claims 1-4.

[0006] The present invention also provides an expression vector as set out in claims 5-7, as well as a host cell as
set out in claims 8-9.

[0007] The present invention provides a GDF-8 polypeptide or a functional fragment thereof as set out in claim 10
and a method for the production of GDF-8 polypeptide or functional fragment thereof as set out in claim 11.

[0008] The present invention provides antibodies or fragments thereof as set out in claims 12-13, and a diagnostic 
composition as set out in claim 14.

[0009] The present invention provides a method of detecting a cell proliferative disorder as set out in claims 15-18.

[0010] The present invention provides an antisense sequence as set out in claim 19 and a ribozyme as set out in
claim 20.

[0011] The present invention provides a therapeutic composition as set out in claim 21 and the use of an antibody 
or fragment thereof, an antisense sequence or a ribozyme, as set out in claims 22-38. 

BRIEF DESCRIPTION OF THE DRAWINGS
[0012] 

FIGURE 1 is a Northern blot showing expression of GDF-8 mRNA in adult tissues. The probe was a partial murine
GDF-8 clone.

FIGURE 2 shows nucleotide and predicted amino acid sequences of murine GDF-8 (FIGURE 2a) and human 
GDF-8 (FIGURE 2b). The putative dibasic processing sites in the murine sequence are boxed. 

FIGURE 3 shows the alignment of the C-terminal sequences of GDF-8 with other members of the TGF-β superfamily.
The conserved cysteine residues are boxed. Dashes denote gaps introduced in order to maximize alignment.

FIGURE 4 shows amino acid homologies among different members of the TGF-β superfamily. Numbers represent
percent amino acid identities between each pair calculated from the first conserved cysteine to the C-terminus.
Boxes represent homologies among highly-related members within particular subgroups. 

FIGURE 5 shows the sequence of GDF-8. Nucleotide and amino acid sequences of murine (FIGURE 5a) and
human (FIGURE 5b) GDF-8 cDNA clones are shown. Numbers indicate nucleotide position relative to the 5' end.
Consensus N-linked glycosylation signals are shaded. The putative RXXR proteolytic cleavage sites are boxed.

FIGURE 6 shows a hydropathicity profile of GDF-8. Average hydrophobicity values for murine (FIGURE 6a) and 
human (FIGURE 6b) GDF-8 were calculated using the method of Kyte and Doolittle (J. Mol. Biol., 157:105-132, 
1982). Positive numbers indicate increasing hydrophobicity.

FIGURE 7 shows a comparison of murine and human GDF-8 amino acid sequences. The predicted murine se- 
quence is shown in the top lines and the predicted human sequence is shown in the bottom lines. Numbers indicate 
amino acid position relative to the N-terminus. Identities between the two sequences are denoted by a vertical line. 

FIGURE 8 shows the expression of GDF-8 in bacteria. BL21 (DE3) (pLysS) cells carrying a pRSET/GDF-8 expression
plasmid were induced with isopropylthio-β-galactoside, and the GDF-8 fusion protein was purified by
metal chelate chromatography. Lanes: total=total cell lysate; soluble=soluble protein fraction; insoluble=insoluble
protein fraction (resuspended in 10 mM Tris pH 8.0, 50 mM sodium phosphate, 8 M urea, and 10 mM β-mercaptoethanol
[buffer B]) loaded onto the column; pellet=insoluble protein fraction discarded before loading the column;
flowthrough=proteins not bound by the column; washes=washes carried out in buffer B at the indicated pH's.
Positions of molecular weight standards are shown at the right. Arrow indicates the position of the GDF-8 fusion
protein.

FIGURE 9 shows the expression of GDF-8 in mammalian cells. Chinese hamster ovary cells were transfected with
pMSXND/GDF-8 expression plasmids and selected in G418. Conditioned media from G418-resistant cells (prepared
from cells transfected with constructs in which GDF-8 was cloned in either the antisense or sense orientation)
were concentrated, electrophoresed under reducing conditions, blotted, and probed with anti-GDF-8 antibodies
and [125I]iodoprotein A. Arrow indicates the position of the processed GDF-8 protein.

FIGURE 10 shows the expression of GDF-8 mRNA. Poly A-selected RNA (5 ng each) prepared from adult tissues
(FIGURE 10a) or placentas and embryos (FIGURE 10b) at the indicated days of gestation was electrophoresed 
on formaldehyde gels, blotted, and probed with full length murine GDF-8. 

FIGURE 11 shows chromosomal mapping of human GDF-8. DNA samples prepared from human/rodent somatic
cell hybrid lines were subjected to PCR, electrophoresed on agarose gels, blotted, and probed. The human chromosome
contained in each of the hybrid cell lines is identified at the top of each of the first 24 lanes (1-22, X, and
Y). In the lanes designated M, CHO, and H, the starting DNA template was total genomic DNA from mouse, hamster,
and human sources, respectively. In the lane marked B1, no template DNA was used. Numbers at left indicate the
mobilities of DNA standards.
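
The hydropathicity profile described for FIGURE 6 can be illustrated with a short sketch of the Kyte and Doolittle
method: each residue is assigned its published hydropathy index, and the values are averaged over a sliding window.
The window length (9 here) and the toy peptide are assumptions made for illustration only; the figure description
does not state the window used, and the peptide is not taken from the GDF-8 sequence.

```python
# Kyte-Doolittle hydropathy index (J. Mol. Biol., 157:105-132, 1982).
# Positive values indicate increasing hydrophobicity.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Return the mean hydropathy of every full window along seq.

    The window length is an assumed parameter, not a value from the text.
    """
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    values = [KD[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical toy peptide used only to exercise the function.
profile = hydropathy_profile("MKLLVVAAGRDE", window=5)
```

Plotting `profile` against the window start position gives a curve of the kind shown in a hydropathicity figure.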

DETAILED DESCRIPTION OF THE INVENTION

[0013] The present invention provides a growth and differentiation factor, GDF-8, and a polynucleotide sequence
encoding GDF-8. GDF-8 is expressed at highest levels in muscle and at lower levels in adipose tissue. In one embodiment,
the invention provides a method for detection of a cell proliferative disorder of muscle, nerve, or fat origin which
is associated with GDF-8 expression. In another embodiment, the invention provides a method for treating a cell pro- 
liferative disorder by using an agent which suppresses or enhances GDF-8 activity. 

[0014] The TGF-β superfamily consists of multifunctional polypeptides that control proliferation, differentiation, and
other functions in many cell types. Many of the peptides have regulatory effects, both positive and negative, on other
peptide growth factors. The structural homology between the GDF-8 protein of this invention and the members of the
TGF-β family indicates that GDF-8 is a new member of the family of growth and differentiation factors. Based on the
known activities of many of the other members, it can be expected that GDF-8 will also possess biological activities 
that will make it useful as a diagnostic and therapeutic reagent. 

[0015] In particular, certain members of this superfamily have expression patterns or possess activities that relate
to the function of the nervous system. For example, the inhibins and activins have been shown to be expressed in the
brain (Meunier, et al., Proc. Natl. Acad. Sci., USA, 85:247, 1988; Sawchenko, et al., Nature, 334:615, 1988), and activin
has been shown to be capable of functioning as a nerve cell survival molecule (Schubert, et al., Nature, 344:868, 1990).
Another family member, namely GDF-1, is nervous system-specific in its expression pattern (Lee, S.J., Proc. Natl.
Acad. Sci., USA, 88:4250, 1991), and certain other family members, such as Vgr-1 (Lyons, et al., Proc. Natl. Acad. Sci.,
USA, 86:4554, 1989; Jones, et al., Development, 111:531, 1991), OP-1 (Ozkaynak, et al., J. Biol. Chem., 267:25220,
1992), and BMP-4 (Jones, et al., Development, 111:531, 1991), are also known to be expressed in the nervous system.
Because it is known that skeletal muscle produces a factor or factors that promote the survival of motor neurons (Brown,
Trends Neurosci., 7:10, 1984), the expression of GDF-8 in muscle suggests that one activity of GDF-8 may be as a
trophic factor for neurons. In this regard, GDF-8 may have applications in the treatment of neurodegenerative diseases,
such as amyotrophic lateral sclerosis, or in maintaining cells or tissues in culture prior to transplantation.

[0016] GDF-8 may also have applications in treating disease processes involving muscle, such as in musculodegenerative
diseases or in tissue repair due to trauma. In this regard, many other members of the TGF-β family are also
important mediators of tissue repair. TGF-β has been shown to have marked effects on the formation of collagen and
to cause a striking angiogenic response in the newborn mouse (Roberts, et al., Proc. Natl. Acad. Sci., USA, 83:4167,
1986). TGF-β has also been shown to inhibit the differentiation of myoblasts in culture (Massague, et al., Proc. Natl.
Acad. Sci., USA, 83:8206, 1986). Moreover, because myoblast cells may be used as a vehicle for delivering genes to
muscle for gene therapy, the properties of GDF-8 could be exploited for maintaining cells prior to transplantation or for
enhancing the efficiency of the fusion process.

[0017] The expression of GDF-8 in adipose tissue also raises the possibility of applications for GDF-8 in the treatment
of obesity or of disorders related to abnormal proliferation of adipocytes. In this regard, TGF-β has been shown to be
a potent inhibitor of adipocyte differentiation in vitro (Ignotz and Massague, Proc. Natl. Acad. Sci., USA, 82:8530, 1985).
[0018] The term "substantially pure" as used herein refers to GDF-8 which is substantially free of other proteins, 
lipids, carbohydrates or other materials with which it is naturally associated. One skilled in the art can purify GDF-8 
using standard techniques for protein purification. The substantially pure polypeptide will yield a single major band on
a non-reducing polyacrylamide gel. The purity of the GDF-8 polypeptide can also be determined by amino-terminal
amino acid sequence analysis. GDF-8 polypeptide includes functional fragments of the polypeptide, as long as the 
activity of GDF-8 remains. Smaller peptides containing the biological activity of GDF-8 are included in the invention. 
[0019] The invention provides polynucleotides encoding the GDF-8 protein. These polynucleotides include DNA, 
cDNA and RNA sequences which encode GDF-8. It is understood that all polynucleotides encoding all or a portion of
GDF-8 are also included herein, as set out in claims 1-4, as long as they encode a polypeptide with GDF-8 activity.
Such polynucleotides include naturally occurring, synthetic, and intentionally manipulated polynucleotides. For exam- 
ple, GDF-8 polynucleotide may be subjected to site-directed mutagenesis. The polynucleotide sequence for GDF-8 
also includes antisense sequences. The polynucleotides of the invention include sequences that are degenerate as a 
result of the genetic code. There are 20 natural amino acids, most of which are specified by more than one codon.
Therefore, all degenerate nucleotide sequences are included in the invention as long as the amino acid sequence of
GDF-8 polypeptide encoded by the nucleotide sequence is functionally unchanged. 
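
The degeneracy of the genetic code referred to above can be made concrete with a minimal sketch. It builds the
standard genetic code from its usual compact 64-letter representation (codons ordered T, C, A, G at each position),
groups the codons by the amino acid they specify, and counts how many distinct nucleotide sequences encode a given
peptide. The function names and the example peptides are hypothetical and are not taken from the GDF-8 sequence.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (or "*") to the list of codons that specify it.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(codon)


def count_coding_sequences(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide.upper())


# Methionine and tryptophan each have a single codon; leucine has six.
single = count_coding_sequences("MW")
# Hypothetical tripeptide: 6 (R) * 2 (K) * 6 (R) possibilities.
many = count_coding_sequences("RKR")
```

Because the count multiplies across residues, even a short peptide is typically encoded by many degenerate
nucleotide sequences.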

[0020] Specifically disclosed herein is a genomic DNA sequence containing a portion of the GDF-8 gene. The se- 
quence contains an open reading frame corresponding to the predicted C-terminal region of the GDF-8 precursor 
protein. The encoded polypeptide is predicted to contain two potential proteolytic processing sites (KR and RR). Cleavage
of the precursor at the downstream site would generate a mature biologically active C-terminal fragment of 109
amino acids with a predicted molecular weight of approximately 12,400. Also disclosed are full length murine and
human GDF-8 cDNA sequences. The murine pre-pro-GDF-8 protein is 376 amino acids in length and is encoded
by a 2676 base pair nucleotide sequence, with the open reading frame beginning at nucleotide 104 and extending to
a TGA stop codon at nucleotide 1232. The human GDF-8 protein is 375 amino acids and is encoded by a 2743 base
pair sequence, with the open reading frame beginning at nucleotide 59 and extending to nucleotide 1184.
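
As a quick arithmetic check on the coordinates quoted above, the following sketch converts an open reading frame's
start position and stop-codon position into a codon count. It assumes that positions are 1-based and that the quoted
end coordinate is the first nucleotide of the stop codon; that reading is an assumption, but under it both sets of
numbers agree with the stated protein lengths. The function name is hypothetical.

```python
def orf_codon_count(first_nt, stop_codon_nt):
    """Sense codons in an ORF that starts at first_nt (1-based) and whose
    stop codon begins at stop_codon_nt (assumed convention)."""
    coding_nt = stop_codon_nt - first_nt
    if coding_nt <= 0 or coding_nt % 3:
        raise ValueError("coordinates do not describe a whole number of codons")
    return coding_nt // 3


murine_codons = orf_codon_count(104, 1232)  # (1232 - 104) / 3
human_codons = orf_codon_count(59, 1184)    # (1184 - 59) / 3
```

Under the stated assumption, `murine_codons` is 376 and `human_codons` is 375, matching the 376 and 375 amino acid
lengths given in the text.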

[0021] The C-terminal region of GDF-8 following the putative proteolytic processing site shows significant homology 
to the known members of the TGF-β superfamily. The GDF-8 sequence contains most of the residues that are highly
conserved in other family members (see FIGURE 3). Like the TGF-βs and inhibin βs, GDF-8 contains an extra pair of
cysteine residues in addition to the 7 cysteines found in virtually all other family members. Among the known family 
members, GDF-8 is most homologous to Vgr-1 (45% sequence identity) (see FIGURE 4). 
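
A percent identity figure of the kind quoted above can be computed from a pairwise alignment as in the sketch below.
It compares two gapped, equal-length sequences column by column, starting at a chosen column (for example, the first
conserved cysteine, as in FIGURE 4). Skipping columns that contain a gap in either sequence is an assumption; the
text does not say how gaps were counted. The sequences shown are hypothetical toy strings, not GDF-8 or Vgr-1.

```python
def percent_identity(aligned_a, aligned_b, start=0, gap="-"):
    """Percent identical residues over aligned columns from `start` onward.

    Columns with a gap in either sequence are skipped (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a[start:], aligned_b[start:])
        if a != gap and b != gap
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment; the first cysteine sits at column index 2.
seq_a = "MKCAK-LC"
seq_b = "-RCSKPLC"
identity = percent_identity(seq_a, seq_b, start=seq_a.index("C"))
```

In this toy case four of the five ungapped columns from the first cysteine onward match, so `identity` is 80.0.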

[0022] Minor modifications of the recombinant GDF-8 primary amino acid sequence may result in proteins which 
have substantially equivalent activity as compared to the GDF-8 polypeptide described herein. Such modifications may
be deliberate, as by site-directed mutagenesis, or may be spontaneous. All of the polypeptides produced by these
modifications are included herein as long as the biological activity of GDF-8 still exists. Further, deletion of one or more 
amino acids can also result in a modification of the structure of the resultant molecule without significantly altering its 
biological activity. This can lead to the development of a smaller active molecule which would have broader utility. For 
example, one can remove amino or carboxy terminal amino acids which are not required for GDF-8 biological activity. 

[0023] The nucleotide sequence encoding the GDF-8 polypeptide of the invention includes the disclosed sequence
and conservative variations thereof. The term "conservative variation" as used herein denotes the replacement of an 
amino acid residue by another, biologically similar residue. Examples of conservative variations include the substitution 
of one hydrophobic residue such as isoleucine, valine, leucine or methionine for another, or the substitution of one 
polar residue for another, such as the substitution of arginine for lysine, glutamic acid for aspartic acid, or glutamine for
asparagine, and the like. The term "conservative variation" also includes the use of a substituted amino acid in place
of an unsubstituted parent amino acid provided that antibodies raised to the substituted polypeptide also immunoreact 
with the unsubstituted polypeptide. 
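
The examples of conservative variation listed above can be expressed as a small lookup, sketched here under the
assumption that only the residue groups named in the text are considered. The text's "and the like" indicates the
list is not exhaustive, so a result of False from this sketch means only that the pair is not among the named
examples. The group table and function name are hypothetical.

```python
# Residue groups taken from the examples in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # arginine, lysine
    frozenset("DE"),    # aspartic acid, glutamic acid
    frozenset("NQ"),    # asparagine, glutamine
]


def is_listed_conservative(original, replacement):
    """True if the substitution falls within one of the named groups."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residue, so there is no substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


arg_to_lys = is_listed_conservative("R", "K")  # named example in the text
ile_to_asp = is_listed_conservative("I", "D")  # hydrophobic to acidic
```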

[0024] DNA sequences of the invention can be obtained by several methods. For example, the DNA can be isolated 
using hybridization techniques which are well known in the art. These include, but are not limited to: 1) hybridization
of genomic or cDNA libraries with probes to detect homologous nucleotide sequences, 2) polymerase chain reaction
(PCR) on genomic DNA or cDNA using primers capable of annealing to the DNA sequence of interest, and 3) antibody 
screening of expression libraries to detect cloned DNA fragments with shared structural features. 
[0025] Preferably the GDF-8 polynucleotide of the invention is derived from a mammalian organism, and most preferably
from a mouse, rat, or human. Screening procedures which rely on nucleic acid hybridization make it possible
to isolate any gene sequence from any organism, provided the appropriate probe is available. Oligonucleotide probes,
which correspond to a part of the sequence encoding the protein in question, can be synthesized chemically. This 
requires that short, oligopeptide stretches of amino acid sequence must be known. The DNA sequence encoding the 
protein can be deduced from the genetic code; however, the degeneracy of the code must be taken into account. It is
possible to perform a mixed addition reaction when the sequence is degenerate. This includes a heterogeneous mixture
of denatured double-stranded DNA. For such screening, hybridization is preferably performed on either single-stranded
DNA or denatured double-stranded DNA. Hybridization is particularly useful in the detection of cDNA clones derived 
from sources where an extremely low amount of mRNA sequences relating to the polypeptide of interest are present. 
In other words, by using stringent hybridization conditions directed to avoid non-specific binding, it is possible, for 
example, to allow the autoradiographic visualization of a specific cDNA clone by the hybridization of the target DNA 
to that single probe in the mixture which is its complete complement (Wallace, et al., Nucl. Acid Res., 9:879, 1981). 
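
When a probe is designed from a known oligopeptide stretch, the size of the mixed oligonucleotide pool depends on
which residues the stretch contains. The sketch below, offered only as an illustration, slides a window along a
peptide and reports the window whose encoding pool is smallest. The codon counts are those of the standard genetic
code; the window length, function name, and example peptide are assumptions and are not drawn from the text.

```python
from math import prod

# Number of codons specifying each amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def least_degenerate_window(peptide, length=6):
    """Return (start, window, pool_size) for the window needing the fewest
    distinct oligonucleotides to cover every possible coding sequence."""
    peptide = peptide.upper()
    if not 1 <= length <= len(peptide):
        raise ValueError("length must be between 1 and len(peptide)")
    best = None
    for start in range(len(peptide) - length + 1):
        window = peptide[start:start + length]
        pool_size = prod(CODONS_PER_RESIDUE[aa] for aa in window)
        if best is None or pool_size < best[2]:
            best = (start, window, pool_size)
    return best


# Hypothetical peptide; a three-residue window is used to keep it short.
start, window, pool_size = least_degenerate_window("LSRMWKDE", length=3)
```

For this toy peptide the window rich in methionine and tryptophan gives the smallest pool, which is why such
residues are attractive when choosing a stretch for a mixed probe.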
[0026] Specific DNA sequences encoding GDF-8 can also be obtained by: 1) isolation of double-stranded
DNA sequences from the genomic DNA; 2) chemical manufacture of a DNA sequence to provide the necessary
codons for the polypeptide of interest; and 3) in vitro synthesis of a double-stranded DNA sequence by reverse
transcription of mRNA isolated from a eukaryotic donor cell. In the latter case, a double-stranded DNA complement of
mRNA is eventually formed which is generally referred to as cDNA.

[0027] Of the three above-noted methods for developing specific DNA sequences for use in recombinant procedures, 
the isolation of genomic DNA isolates is the least common. This is especially true when it is desirable to obtain the 
microbial expression of mammalian polypeptides due to the presence of introns. 

[0028] The synthesis of DNA sequences is frequently the method of choice when the entire sequence of amino acid 
residues of the desired polypeptide product is known. When the entire sequence of amino acid residues of the desired
polypeptide is not known, the direct synthesis of DNA sequences is not possible and the method of choice is the 
synthesis of cDNA sequences. Among the standard procedures for isolating cDNA sequences of interest is the forma- 
tion of plasmid- or phage-carrying cDNA libraries which are derived from reverse transcription of mRNA which is abun- 
dant in donor cells that have a high level of genetic expression. When used in combination with polymerase chain 
reaction technology, even rare expression products can be cloned. In those cases where significant portions of the
amino acid sequence of the polypeptide are known, the production of labeled single or double-stranded DNA or RNA 
probe sequences duplicating a sequence putatively present in the target cDNA may be employed in DNA/DNA hybridization
procedures which are carried out on cloned copies of the cDNA which have been denatured into a single-stranded
form (Jay, et al., Nucl. Acid Res., 11:2325, 1983).

[0029] A cDNA expression library, such as lambda gt11, can be screened indirectly for GDF-8 peptides having at
least one epitope, using antibodies specific for GDF-8. Such antibodies can be either polyclonally or monoclonally
derived and used to detect expression product indicative of the presence of GDF-8 cDNA. 

[0030] DNA sequences encoding GDF-8 can be expressed in vitro by DNA transfer into a suitable host cell. "Host
cells" are cells in which a vector can be propagated and its DNA expressed. The term also includes any progeny of 
the subject host cell. It is understood that all progeny may not be identical to the parental cell since there may be 
mutations that occur during replication. However, such progeny are included when the term "host cell" is used. 
[0031] Methods of stable transfer, meaning that the foreign DNA is continuously maintained in the host, are known
in the art.

[0032] In the present invention, the GDF-8 polynucleotide sequences may be inserted into a recombinant expression 
vector. The term "recombinant expression vector" refers to a plasmid, virus or other vehicle known in the art that has 
been manipulated by insertion or incorporation of the GDF-8 genetic sequences. Such expression vectors contain a 
promoter sequence which facilitates the efficient transcription of the inserted genetic sequence of the host. The expression
vector typically contains an origin of replication, a promoter, as well as specific genes which allow phenotypic
selection of the transformed cells. Vectors suitable for use in the present invention include, but are not limited to, the
T7-based expression vector for expression in bacteria (Rosenberg, et al., Gene, 56:125, 1987), the pMSXND expression
vector for expression in mammalian cells (Lee and Nathans, J. Biol. Chem., 263:3521, 1988) and baculovirus-derived
vectors for expression in insect cells. The DNA segment can be present in the vector operably linked to regulatory
elements, for example, a promoter (e.g., T7, metallothionein I, or polyhedrin promoters).

[0033] Polynucleotide sequences encoding GDF-8 can be expressed in either prokaryotes or eukaryotes. Hosts can 
include microbial, yeast, insect and mammalian organisms. Methods of expressing DNA sequences having eukaryotic 
or viral sequences in prokaryotes are well known in the art. Biologically functional viral and plasmid DNA vectors 
capable of expression and replication in a host are known in the art. Such vectors are used to incorporate DNA sequences
of the invention. Preferably, the mature C-terminal region of GDF-8 is expressed from a cDNA clone containing
the entire coding sequence of GDF-8. Alternatively, the C-terminal portion of GDF-8 can be expressed as a fusion
protein with the pro-region of another member of the TGF-β family or co-expressed with another pro-region (see for
example, Hammonds, et al., Molec. Endocrin. 5:149, 1991; Gray, A., and Mason, A., Science, 247:1328, 1990). 
[0034] Transformation of a host cell with recombinant DNA may be carried out by conventional techniques as are
well known to those skilled in the art. Where the host is prokaryotic, such as E. coli, competent cells which are capable
of DNA uptake can be prepared from cells harvested after exponential growth phase and subsequently treated by the
CaCl2 method using procedures well known in the art. Alternatively, MgCl2 or RbCl can be used. Transformation can
also be performed after forming a protoplast of the host cell if desired. 

[0035] When the host is a eukaryote, such methods of transfection of DNA as calcium phosphate co-precipitates,
conventional mechanical procedures such as microinjection, electroporation, insertion of a plasmid encased in liposomes,
or virus vectors may be used. Eukaryotic cells can also be cotransformed with DNA sequences encoding the
GDF-8 of the invention, and a second foreign DNA molecule encoding a selectable phenotype, such as the herpes
simplex thymidine kinase gene. Another method is to use a eukaryotic viral vector, such as simian virus 40 (SV40) or
bovine papilloma virus, to transiently infect or transform eukaryotic cells and express the protein (see for example,
Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982).

[0036] Isolation and purification of microbially expressed polypeptide, or fragments thereof, provided by the invention,
may be carried out by conventional means including preparative chromatography and immunological separations in- 
volving monoclonal or polyclonal antibodies. 

[0037] The invention includes antibodies immunoreactive with GDF-8 polypeptide or functional fragments thereof.
Antibody which consists essentially of pooled monoclonal antibodies with different epitopic specificities, as well as
distinct monoclonal antibody preparations, are provided. Monoclonal antibodies are made from antigen-containing
fragments of the protein by methods well known to those skilled in the art (Kohler, et al., Nature, 256:495, 1975). The term
antibody as used in this invention is meant to include intact molecules as well as fragments thereof, such as Fab and
F(ab')2, which are capable of binding an epitopic determinant on GDF-8.

[0038] The term "cell-proliferative disorder" denotes malignant as well as non-malignant cell populations which often
appear to differ from the surrounding tissue both morphologically and genotypically. Malignant cells (i.e. cancer) develop
as a result of a multistep process. The GDF-8 polynucleotide that is an antisense molecule is useful in treating malignancies
of the various organ systems, particularly, for example, cells in muscle or adipose tissue. Essentially, any
disorder which is etiologically linked to altered expression of GDF-8 could be considered susceptible to treatment with
a GDF-8 suppressing reagent. One such disorder is a malignant cell proliferative disorder, for example.

[0039] The invention provides a method for detecting a cell proliferative disorder of muscle or adipose tissue which 
comprises contacting an anti-GDF-8 antibody with a cell suspected of having a GDF-8 associated disorder and de- 
tecting binding to the antibody. The antibody reactive with GDF-8 is labeled with a compound which allows detection 
of binding to GDF-8. For purposes of the invention, an antibody specific for GDF-8 polypeptide may be used to detect
the level of GDF-8 in biological fluids and tissues. Any specimen containing a detectable amount of antigen can be 
used. A preferred sample of this invention is muscle tissue. The level of GDF-8 in the suspect cell can be compared 
with the level in a normal cell to determine whether the subject has a GDF-8-associated cell proliferative disorder.
Preferably the subject is human.

[0040] The antibodies of the invention can be used in any subject in which it is desirable to administer in vitro or in 
vivo immunodiagnosis or immunotherapy. The antibodies of the invention are suited for use, for example, in immu- 
noassays in which they can be utilized in liquid phase or bound to a solid phase carrier. In addition, the antibodies in 
these immunoassays can be detectably labeled in various ways. Examples of types of immunoassays which can utilize
antibodies of the invention are competitive and non-competitive immunoassays in either a direct or indirect format.
Examples of such immunoassays are the radioimmunoassay (RIA) and the sandwich (immunometric) assay. Detection 
of the antigens using the antibodies of the invention can be done utilizing immunoassays which are run in either the 
forward, reverse, or simultaneous modes, including immunohistochemical assays on physiological samples. Those of 
skill in the art will know, or can readily discern, other immunoassay formats without undue experimentation. 

[0041] The antibodies of the invention can be bound to many different carriers and used to detect the presence of
an antigen comprising the polypeptide of the invention. Examples of well-known carriers include glass, polystyrene, 
polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses, polyacrylamides, agaroses 
and magnetite. The nature of the carrier can be either soluble or insoluble for purposes of the invention. Those skilled 
in the art will know of other suitable carriers for binding antibodies, or will be able to ascertain such, using routine
experimentation.

[0042] There are many different labels and methods of labeling known to those of ordinary skill in the art. Examples 
of the types of labels which can be used in the present invention include enzymes, radioisotopes, fluorescent compounds,
colloidal metals, chemiluminescent compounds, phosphorescent compounds, and bioluminescent compounds.
Those of ordinary skill in the art will know of other suitable labels for binding to the antibody, or will be able to
ascertain such, using routine experimentation.

[0043] Another technique which may also result in greater sensitivity consists of coupling the antibodies to low mo- 
lecular weight haptens. These haptens can then be specifically detected by means of a second reaction. For example, 
it is common to use such haptens as biotin, which reacts with avidin, or dinitrophenyl, pyridoxal, and fluorescein, which
can react with specific anti-hapten antibodies. 

[0044] In using the monoclonal antibodies of the invention for the in vivo detection of antigen, the detectably labeled
antibody is given in a dose which is diagnostically effective. The term "diagnostically effective" means that the amount
of detectably labeled monoclonal antibody is administered in sufficient quantity to enable detection of the site having 
the antigen comprising a polypeptide of the invention for which the monoclonal antibodies are specific. 
[0045] The concentration of detectably labeled monoclonal antibody which is administered should be sufficient such
that the binding to those cells having the polypeptide is detectable compared to the background. Further, it is desirable
that the detectably labeled monoclonal antibody be rapidly cleared from the circulatory system in order to give the best 
target-to-background signal ratio. 

[0046] As a rule, the dosage of detectably labeled monoclonal antibody for in vivo diagnosis will vary depending on 
such factors as age, sex, and extent of disease of the individual. Such dosages may vary, for example, depending on
whether multiple injections are given, antigenic burden, and other factors known to those of skill in the art.

[0047] For in vivo diagnostic imaging, the type of detection instrument available is a major factor in selecting a given 
radioisotope. The radioisotope chosen must have a type of decay which is detectable for a given type of instrument. 
Still another important factor in selecting a radioisotope for in vivo diagnosis is that deleterious radiation with respect 
to the host is minimized. Ideally, a radioisotope used for in vivo imaging will lack a particle emission, but produce a
large number of photons in the 140-250 keV range, which may readily be detected by conventional gamma cameras.
[0048] For in vivo diagnosis radioisotopes may be bound to immunoglobulin either directly or indirectly by using an 
intermediate functional group. Intermediate functional groups which often are used to bind radioisotopes which exist 
as metallic ions to immunoglobulins are the bifunctional chelating agents such as diethylenetriaminepentaacetic acid
(DTPA) and ethylenediaminetetraacetic acid (EDTA) and similar molecules. Typical examples of metallic ions which
can be bound to the monoclonal antibodies of the invention are 111In, 97Ru, 67Ga, 68Ga, 72As, 89Zr, and 201Tl.

[0049] The monoclonal antibodies of the invention can also be labeled with a paramagnetic isotope for purposes of 
in vivo diagnosis, as in magnetic resonance imaging (MRI) or electron spin resonance (ESR). In general, any conven- 
tional method for visualizing diagnostic imaging can be utilized. Usually gamma and positron emitting radioisotopes 
are used for camera imaging and paramagnetic isotopes for MRI. Elements which are particularly useful in such techniques
include 157Gd, 55Mn, 162Dy, 52Cr, and 56Fe.

[0050] The monoclonal antibodies of the invention can be used in vitro and in vivo to monitor the course of amelio- 
ration of a GDF-8-associated disease in a subject. Thus, for example, by measuring the increase or decrease in the 
number of cells expressing antigen comprising a polypeptide of the invention or changes in the concentration of such 
antigen present in various body fluids, it would be possible to determine whether a particular therapeutic regimen aimed
at ameliorating the GDF-8-associated disease is effective. The term "ameliorate" denotes a lessening of the detrimental 
effect of the GDF-8-associated disease in the subject receiving therapy. 

[0051] The present invention identifies a nucleotide sequence that can be expressed in an altered manner as compared
to expression in a normal cell; therefore, it is possible to design appropriate therapeutic or diagnostic techniques
directed to this sequence. Thus, where a cell-proliferative disorder is associated with the expression of GDF-8, nucleic 
acid sequences that interfere with GDF-8 expression at the translational level can be used. This approach utilizes, for 
example, antisense nucleic acid and ribozymes to block translation of a specific GDF-8 mRNA, either by masking that 
mRNA with an antisense nucleic acid or by cleaving it with a ribozyme. Such disorders include neurodegenerative 
diseases, for example. 

[0052] Antisense nucleic acids are DNA or RNA molecules that are complementary to at least a portion of a specific 
mRNA molecule (Weintraub, Scientific American, 262:40, 1990). In the cell, the antisense nucleic acids hybridize to 
the corresponding mRNA, forming a double-stranded molecule. The antisense nucleic acids interfere with the trans- 
lation of the mRNA, since the cell will not translate a mRNA that is double-stranded. Antisense oligomers of about 15 
nucleotides are preferred, since they are easily synthesized and are less likely to cause problems than larger molecules 
when introduced into the target GDF-8-producing cell. The use of antisense methods to inhibit the in vitro translation 
of genes is well known in the art (Marcus-Sakura, Anal. Biochem., 172:289, 1988). 
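
The complementarity described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the 25-nucleotide mRNA fragment is a made-up placeholder (not a GDF-8 sequence from this document), the function names are hypothetical, and no account is taken of secondary structure or target accessibility.

```python
# Illustrative sketch: derive 15-nucleotide antisense RNA oligomers from an mRNA.
# The mRNA fragment below is a hypothetical placeholder, not a GDF-8 sequence.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna_segment: str) -> str:
    """Return the RNA strand complementary to mrna_segment, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna_segment.upper()))


def antisense_windows(mrna: str, length: int = 15):
    """Yield (start index, antisense oligomer) for every window of `length` bases."""
    for start in range(len(mrna) - length + 1):
        yield start, antisense_rna(mrna[start:start + length])


example_mrna = "AUGGCUACGUUAGCCGAUCGGAUCC"  # hypothetical 25-nt fragment
first_start, first_oligo = next(antisense_windows(example_mrna))
print(first_start, first_oligo)
```

Each oligomer is written 5' to 3', so it pairs antiparallel with the mRNA window from which it was derived.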

[0053] Ribozymes are RNA molecules possessing the ability to specifically cleave other single-stranded RNA in a 
manner analogous to DNA restriction endonucleases. Through the modification of nucleotide sequences which encode 
these RNAs, it is possible to engineer molecules that recognize specific nucleotide sequences in an RNA molecule 
and cleave it (Cech, J. Amer. Med. Assn., 260:3030, 1988). A major advantage of this approach is that, because they
are sequence-specific, only mRNAs with particular sequences are inactivated. 

[0054] There are two basic types of ribozymes, namely, tetrahymena-type (Hasselhoff, Nature, 334:585, 1988) and
"hammerhead"-type. Tetrahymena-type ribozymes recognize sequences which are four bases in length, while "ham- 
merhead"-type ribozymes recognize base sequences 11-18 bases in length. The longer the recognition sequence, the 
greater the likelihood that the sequence will occur exclusively in the target mRNA species. Consequently,
hammerhead-type ribozymes are preferable to tetrahymena-type ribozymes for inactivating a specific mRNA species and 18-base
recognition sequences are preferable to shorter recognition sequences.
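
The relationship between recognition-sequence length and specificity can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that the four bases are equally likely and independent at each position and that the cell contains a pool of 10^8 bases of RNA; neither figure comes from this document.

```python
# Rough sketch: expected number of chance matches to a recognition sequence.
# Assumes equally likely, independent bases; the pool size is hypothetical.
def chance_match_probability(site_length: int) -> float:
    """Probability that a given position matches one specific site by chance."""
    return 0.25 ** site_length


def expected_chance_sites(site_length: int, total_bases: int) -> float:
    """Expected number of chance matches among total_bases positions."""
    return total_bases * chance_match_probability(site_length)


POOL_BASES = 10**8  # assumed size of the RNA pool, for illustration only
for n in (4, 11, 18):
    print(n, expected_chance_sites(n, POOL_BASES))
```

Under these assumptions a four-base site is expected to recur many times by chance, whereas an 18-base site is expected to occur far less than once, which is the reasoning behind the stated preference.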

[0055] The present invention also provides gene therapy for the treatment of cell proliferative or immunologic disor- 
ders which are mediated by GDF-8 protein. Such therapy would achieve its therapeutic effect by introduction of the 
GDF-8 antisense polynucleotide into cells having the proliferative disorder. Delivery of antisense GDF-8 polynucleotide 
can be achieved using a recombinant expression vector such as a chimeric virus or a colloidal dispersion system. 
Especially preferred for therapeutic delivery of antisense sequences is the use of targeted liposomes. 
[0056] Various viral vectors which can be utilized for gene therapy as taught herein include adenovirus, herpes virus, 
vaccinia, or, preferably, an RNA virus such as a retrovirus. Preferably, the retroviral vector is a derivative of a murine 
or avian retrovirus. Examples of retroviral vectors in which a single foreign gene can be inserted include, but are not 
limited to: Moloney murine leukemia virus (MoMuLV), Harvey murine sarcoma virus (HaMuSV), murine mammary 
tumor virus (MuMTV), and Rous Sarcoma Virus (RSV). A number of additional retroviral vectors can incorporate multiple
genes. All of these vectors can transfer or incorporate a gene for a selectable marker so that transduced cells can be 
identified and generated. By inserting a GDF-8 sequence of interest into the viral vector, along with another gene which 
encodes the ligand for a receptor on a specific target cell, for example, the vector is now target specific. Retroviral 
vectors can be made target specific by attaching, for example, a sugar, a glycolipid, or a protein. Preferred targeting 
is accomplished by using an antibody to target the retroviral vector. Those of skill in the art will know of, or can readily 
ascertain without undue experimentation, specific polynucleotide sequences which can be inserted into the retroviral 
genome or attached to a viral envelope to allow target specific delivery of the retroviral vector containing the GDF-8 
antisense polynucleotide. 

[0057] Since recombinant retroviruses are defective, they require assistance in order to produce infectious vector 
particles. This assistance can be provided, for example, by using helper cell lines that contain plasmids encoding all 
of the structural genes of the retrovirus under the control of regulatory sequences within the LTR. These plasmids are 
missing a nucleotide sequence which enables the packaging mechanism to recognize an RNA transcript for encapsidation.
Helper cell lines which have deletions of the packaging signal include, but are not limited to, Ψ2, PA317 and
PA12, for example. These cell lines produce empty virions, since no genome is packaged. If a retroviral vector is 
introduced into such cells in which the packaging signal is intact, but the structural genes are replaced by other genes 
of interest, the vector can be packaged and vector virions produced.

[0058] Alternatively, NIH 3T3 or other tissue culture cells can be directly transfected with plasmids encoding the 
retroviral structural genes gag, pol and env, by conventional calcium phosphate transfection. These cells are then 
transfected with the vector plasmid containing the genes of interest. The resulting cells release the retroviral vector
into the culture medium. 
[0059] Another targeted delivery system for GDF-8 antisense polynucleotides is a colloidal dispersion system.
Colloidal dispersion systems include macromolecule complexes, nanocapsules, microspheres, beads, and lipid-based
systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes. The preferred colloidal system of 
this invention is a liposome. Liposomes are artificial membrane vesicles which are useful as delivery vehicles in vitro
and in vivo. It has been shown that large unilamellar vesicles (LUV), which range in size from 0.2-4.0 µm, can encapsulate
a substantial percentage of an aqueous buffer containing large macromolecules. RNA, DNA and intact virions
can be encapsulated within the aqueous interior and be delivered to cells in a biologically active form (Fraley, et al., 
Trends Biochem. Sci., 6:77, 1981). In addition to mammalian cells, liposomes have been used for delivery of polynucleotides
in plant, yeast and bacterial cells. In order for a liposome to be an efficient gene transfer vehicle, the following
characteristics should be present: (1) encapsulation of the genes of interest at high efficiency while not compromising
their biological activity; (2) preferential and substantial binding to a target cell in comparison to non-target cells; (3) 
delivery of the aqueous contents of the vesicle to the target cell cytoplasm at high efficiency; and (4) accurate and 
effective expression of genetic information (Mannino, et al., Biotechniques, 6:682, 1988). 

[0060] The composition of the liposome is usually a combination of phospholipids, particularly high-phase-transition-temperature
phospholipids, usually in combination with steroids, especially cholesterol. Other phospholipids or other
lipids may also be used. The physical characteristics of liposomes depend on pH, ionic strength, and the presence of 
divalent cations. 

[0061] Examples of lipids useful in liposome production include phosphatidyl compounds, such as phosphatidylglycerol,
phosphatidylcholine, phosphatidylserine, phosphatidylethanolamine, sphingolipids, cerebrosides, and gangliosides.
Particularly useful are diacylphosphatidylglycerols, where the lipid moiety contains from 14-18 carbon atoms,
particularly from 16-18 carbon atoms, and is saturated. Illustrative phospholipids include egg phosphatidylcholine, 
dipalmitoylphosphatidylcholine and distearoylphosphatidylcholine. 

[0062] The targeting of liposomes can be classified based on anatomical and mechanistic factors. Anatomical classification
is based on the level of selectivity, for example, organ-specific, cell-specific, and organelle-specific. Mechanistic
targeting can be distinguished based upon whether it is passive or active. Passive targeting utilizes the natural
tendency of liposomes to distribute to cells of the reticuloendothelial system (RES) in organs which contain sinusoidal 
capillaries. Active targeting, on the other hand, involves alteration of the liposome by coupling the liposome to a specific 
ligand such as a monoclonal antibody, sugar, glycolipid, or protein, or by changing the composition or size of the 
liposome in order to achieve targeting to organs and cell types other than the naturally occurring sites of localization. 
[0063] The surface of the targeted delivery system may be modified in a variety of ways. In the case of a liposomal
targeted delivery system, lipid groups can be incorporated into the lipid bilayer of the liposome in order to maintain the 
targeting ligand in stable association with the liposomal bilayer. Various linking groups can be used for joining the lipid 
chains to the targeting ligand. 

[0064] Due to the expression of GDF-8 in muscle and adipose tissue, there are a variety of applications using the
polypeptide, polynucleotide, and antibodies of the invention, related to these tissues. Such applications include treatment
of cell proliferative disorders involving these and other tissues, such as neural tissue. In addition, GDF-8 may be
useful in various gene therapy procedures. 

[0065] The data in Example 6 show that the human GDF-8 gene is located on chromosome 2. By comparing the
chromosomal location of GDF-8 with the map positions of various human disorders, it should be possible to determine
whether mutations in the GDF-8 gene are involved in the etiology of human diseases. For example, an autosomal
recessive form of juvenile amyotrophic lateral sclerosis has been shown to map to chromosome 2 (Hentati, et al.,
Neurology, 42 [Suppl. 3]:201, 1992). More precise mapping of GDF-8 and analysis of DNA from these patients may
indicate that GDF-8 is, in fact, the gene affected in this disease. In addition, GDF-8 is useful for distinguishing chro- 
mosome 2 from other chromosomes. 

[0066] The following examples are intended to illustrate but not limit the invention. While they are typical of those
that might be used, other procedures known to those skilled in the art may alternatively be used. 

EXAMPLE 1 

IDENTIFICATION AND ISOLATION OF A NOVEL TGF-β FAMILY MEMBER

[0067] To identify a new member of the TGF-β superfamily, degenerate oligonucleotides were designed which corresponded
to two conserved regions among the known family members: one region spanning the two tryptophan residues
conserved in all family members except MIS and the other region spanning the invariant cysteine residues near
the C-terminus. These primers were used for polymerase chain reactions on mouse genomic DNA followed by subcloning
the PCR products using restriction sites placed at the 5' ends of the primers, picking individual E. coli colonies
carrying these subcloned inserts, and using a combination of random sequencing and hybridization analysis to eliminate
known members of the superfamily.
[0068] GDF-8 was identified from a mixture of PCR products obtained with the primers

SJL141: 5'-CCGGAATTCGGITGG(G/C/A)A(G/A/T/C)(A/G)A(T/C)TGG(A/G)TI(A/G)TI(T/G)CICC-3' (SEQ ID NO:1)

SJL147: 5'-CCGGAATTC(G/A)CAI(G/C)C(G/A)CA(G/A)CT(G/A/T/C)TCIACI(G/A)(T/C)CAT-3' (SEQ ID NO:2)

[0069] PCR using these primers was carried out with 2 µg mouse genomic DNA at 94°C for 1 min, 50°C for 2 min,
and 72°C for 2 min for 40 cycles.

[0070] PCR products of approximately 280 bp were gel-purified, digested with Eco RI, gel-purified again and subcloned
in the Bluescript vector (Stratagene, San Diego, CA). Bacterial colonies carrying individual subclones were
picked into 96 well microtiter plates, and multiple replicas were prepared by plating the cells onto nitrocellulose. The
replicate filters were hybridized to probes representing known members of the family, and DNA was prepared from
non-hybridizing colonies for sequence analysis.

[0071] The primer combination of SJL141 and SJL147, encoding the amino acid sequences GW(H/Q/N/K/D/E)(D/N)W(V/I/M)(V/I/M)(A/S)P
(SEQ ID NO:9) and M(V/I/M/T/A)V(D/E)SC(G/A)C (SEQ ID NO:10), respectively, yielded four previously identified
sequences (BMP-4, inhibin βB, GDF-3 and GDF-5) and one novel sequence, which was designated GDF-8, among 110 subclones analyzed.
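
The correspondence between a degenerate primer and the amino acids it can encode may be checked with a short sketch. The following is a minimal illustration, not the procedure used to design the primers: it takes the coding portion of SJL141 downstream of the Eco RI site, writes the degenerate positions in IUPAC codes, and assumes that inosine (I) pairs with any base. The trailing CC of the primer (the first two bases of the proline codon) is omitted, and all names in the code are hypothetical.

```python
# Minimal sketch: list the amino acids encoded at each codon of a degenerate primer.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG-ordered table.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "K": "GT", "S": "CG",
         "V": "ACG", "N": "ACGT",
         "I": "ACGT"}  # assumption: inosine is treated as pairing with any base


def residues_per_codon(degenerate_dna: str):
    """Return, for each complete codon, the set of amino acids it can encode."""
    result = []
    for pos in range(0, len(degenerate_dna) - 2, 3):
        codon = degenerate_dna[pos:pos + 3]
        options = product(*(IUPAC[symbol] for symbol in codon))
        result.append({CODON_TABLE["".join(c)] for c in options})
    return result


# Coding portion of SJL141 after the CCGGAATTC tail, in IUPAC notation.
sjl141_coding = "GGITGGVANRAYTGGRTIRTIKCI"
for residues in residues_per_codon(sjl141_coding):
    print("/".join(sorted(residues)))
```

Under these assumptions the eight codons give G, W, (H/Q/N/K/D/E), (D/N), W, (V/I/M), (V/I/M) and (A/S), in agreement with the first eight residues of SEQ ID NO:9.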

[0072] Human GDF-8 was isolated using the primers: 

ACM13: 5'-CGCGGATCCAGAAGTCAAGGTGACAGACACAC-3' (SEQ ID NO:3);

and

ACM14: 5'-CGCGGATCCTCCTCATGAGCACCCACAGCGGTC-3' (SEQ ID NO:4)

[0073] PCR using these primers was carried out with one µg human genomic DNA at 94°C for 1 min, 58°C for 2 min,
and 72°C for 2 min for 30 cycles. The PCR product was digested with Bam HI, gel-purified, and subcloned in the
Bluescript vector (Stratagene, San Francisco, CA).

EXAMPLE 2 

EXPRESSION PATTERN AND SEQUENCE OF GDF-8 

[0074] To determine the expression pattern of GDF-8, RNA samples prepared from a variety of adult tissues were
screened by Northern analysis. RNA isolation and Northern analysis were carried out as described previously (Lee,
S.-J., Mol. Endocrinol., 4:1034, 1990) except that hybridization was carried out in 5X SSPE, 10% dextran sulfate, 50%
formamide, 1% SDS, 200 ng/ml salmon DNA, and 0.1% each of bovine serum albumin, ficoll, and polyvinylpyrrolidone.
Five micrograms of twice poly A-selected RNA prepared from each tissue (except for muscle, for which only 2 µg RNA
was used) were electrophoresed on formaldehyde gels, blotted, and probed with GDF-8. As shown in FIGURE 1, the
probe detected a single mRNA species expressed at highest levels in muscle and at significantly lower levels
in adipose tissue.

[0075] To obtain a larger segment of the GDF-8 gene, a mouse genomic library was screened with a probe derived
from the GDF-8 PCR product. The partial sequence of a GDF-8 genomic clone is shown in FIGURE 2a. The sequence
contains an open reading frame corresponding to the predicted C-terminal region of the GDF-8 precursor protein. The
predicted GDF-8 sequence contains two potential proteolytic processing sites, which are boxed. Cleavage of the precursor
at one of these sites would generate a mature C-terminal fragment 109 amino acids in length with a
predicted molecular weight of 12,400. The partial sequence of human GDF-8 is shown in FIGURE 2b. Assuming no
PCR-induced errors during the isolation of the human clone, the human and mouse amino acid sequences in this
region are 100% identical.
[0076] The C-terminal region of GDF-8 following the putative proteolytic processing site shows significant homology
to the known members of the TGF-β superfamily (FIGURE 3). FIGURE 3 shows the alignment of the C-terminal sequences
of GDF-8 with the corresponding regions of human GDF-1 (Lee, Proc. Natl. Acad. Sci. USA, 88:4250-4254,
1991), human BMP-2 and 4 (Wozney, et al., Science, 242:1528-1534, 1988), human Vgr-1 (Celeste, et al., Proc. Natl.
Acad. Sci. USA, 87:9843-9847, 1990), human OP-1 (Ozkaynak, et al., EMBO J., 9:2085-2093, 1990), human BMP-5
(Celeste, et al., Proc. Natl. Acad. Sci. USA, 87:9843-9847, 1990), human BMP-3 (Wozney, et al., Science,
242:1528-1534, 1988), human MIS (Cate, et al., Cell, 45:685-698, 1986), human inhibin alpha, βA, and βB (Mason, et al.,
Biochem. Biophys. Res. Commun., 135:957-964, 1986), human TGF-β1 (Derynck, et al., Nature, 316:701-705, 1985),
human TGF-β2 (deMartin, et al., EMBO J., 6:3673-3677, 1987), and human TGF-β3 (ten Dijke, et al., Proc. Natl. Acad.
Sci. USA, 85:4715-4719, 1988). The conserved cysteine residues are boxed. Dashes denote gaps introduced in order
to maximize the alignment.

[0077] GDF-8 contains most of the residues that are highly conserved in other family members, including the seven
cysteine residues with their characteristic spacing. Like the TGF-βs and inhibin βs, GDF-8 also contains two additional
cysteine residues. In the case of TGF-β2, these two additional cysteine residues are known to form an intramolecular
disulfide bond (Daopin, et al., Science, 257:369, 1992; Schlunegger and Grutter, Nature, 358:430, 1992).

[0078] FIGURE 4 shows the amino acid homologies among the different members of the TGF-β superfamily. Numbers
represent percent amino acid identities between each pair calculated from the first conserved cysteine to the C-terminus.
Boxes represent homologies among highly-related members within particular subgroups. In this region, GDF-8
is most homologous to Vgr-1 (45% sequence identity). 
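
A percent identity of the kind tabulated in FIGURE 4 can be computed from a pairwise alignment as sketched below. This is an illustration only: the two aligned strings are short made-up placeholders rather than sequences from the figures, and the convention of skipping gap columns is an assumption, since the treatment of gaps is not specified here.

```python
# Illustrative sketch: percent amino acid identity between two aligned sequences.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns that are identical; gap columns are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumed convention: columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, starting at a conserved cysteine.
print(percent_identity("CKRHA-LYV", "CRRHSPLYV"))
```

In practice the two strings would be the aligned regions running from the first conserved cysteine to the C-terminus, as stated above.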

EXAMPLE 3 

ISOLATION OF cDNA CLONES ENCODING MURINE AND HUMAN GDF-8 

[0079] In order to isolate full-length cDNA clones encoding murine and human GDF-8, cDNA libraries were prepared
in the lambda ZAP II vector (Stratagene) using RNA prepared from skeletal muscle. From 5 µg of twice poly A-selected
RNA prepared from murine and human muscle, cDNA libraries consisting of 4.4 million and 1.9 million recombinant
phage, respectively, were constructed according to the instructions provided by Stratagene. These libraries were
screened without amplification. Library screening and characterization of cDNA inserts were carried out as described
previously (Lee, Mol. Endocrinol., 4:1034-1040).

[0080] From 2.4 x 10^6 recombinant phage screened from the murine muscle cDNA library, greater than 280 positive
phage were identified using a murine GDF-8 probe derived from a genomic clone, as described in Example 1. The
entire nucleotide sequence of the longest cDNA insert analyzed is shown in FIGURE 5a and SEQ ID NO:11. The 2676
base pair sequence contains a single long open reading frame beginning with a methionine codon at nucleotide 104
and extending to a TGA stop codon at nucleotide 1232. Upstream of the putative initiating methionine codon is an
in-frame stop codon at nucleotide 23. The predicted pre-pro-GDF-8 protein is 376 amino acids in length. The sequence
contains a core of hydrophobic amino acids at the N-terminus suggestive of a signal peptide for secretion (FIGURE
6a), one potential N-glycosylation site at asparagine 72, a putative RXXR proteolytic cleavage site at amino acids
264-267, and a C-terminal region showing significant homology to the known members of the TGF-β superfamily.
Cleavage of the precursor protein at the putative RXXR site would generate a mature C-terminal GDF-8 fragment 109
amino acids in length with a predicted molecular weight of approximately 12,400.

[0081] From 1.9 x 10^6 recombinant phage screened from the human muscle cDNA library, 4 positive phage were
identified using a human GDF-8 probe derived by polymerase chain reaction on human genomic DNA. The entire
nucleotide sequence of the longest cDNA insert is shown in FIGURE 5b and SEQ ID NO:13. The 2743 base pair
sequence contains a single long open reading frame beginning with a methionine codon at nucleotide 59 and extending
to a TGA stop codon at nucleotide 1184. The predicted pre-pro-GDF-8 protein is 375 amino acids in length. The sequence
contains a core of hydrophobic amino acids at the N-terminus suggestive of a signal peptide for secretion
(FIGURE 6b), one potential N-glycosylation site at asparagine 71, and a putative RXXR proteolytic cleavage site at
amino acids 263-266. FIGURE 7 shows a comparison of the predicted murine (top) and human (bottom) GDF-8 amino
acid sequences. Numbers indicate amino acid position relative to the N-terminus. Identities between the two sequences
are denoted by a vertical line. Murine and human GDF-8 are approximately 94% identical in the predicted pro-regions
and 100% identical following the predicted RXXR cleavage sites.
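
The protein lengths quoted in this example follow from the nucleotide coordinates by simple arithmetic, as the sketch below illustrates. It assumes that the stated coordinates are the first nucleotide of the initiating methionine codon and the first nucleotide of the TGA stop codon, and that cleavage occurs immediately after the last residue of the RXXR site; the function names are hypothetical.

```python
# Sketch: derive precursor and mature-fragment lengths from the stated coordinates.
def orf_length_aa(first_codon_nt: int, stop_codon_nt: int) -> int:
    """Number of amino acids encoded between the start codon and the stop codon."""
    coding_nt = stop_codon_nt - first_codon_nt
    if coding_nt % 3:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt // 3


def mature_length_aa(precursor_aa: int, cleavage_site_last_aa: int) -> int:
    """Length of the C-terminal fragment left after cleavage following the site."""
    return precursor_aa - cleavage_site_last_aa


murine = orf_length_aa(104, 1232)   # murine cDNA coordinates from the text
human = orf_length_aa(59, 1184)     # human cDNA coordinates from the text
print(murine, mature_length_aa(murine, 267))
print(human, mature_length_aa(human, 266))
```

Under these assumptions the murine precursor is 376 amino acids and the human precursor 375 amino acids, and cleavage after residue 267 or 266, respectively, leaves a 109 amino acid C-terminal fragment in each case.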

EXAMPLE 4 

PREPARATION OF ANTIBODIES AGAINST GDF-8 AND EXPRESSION OF GDF-8 IN MAMMALIAN CELLS 
[0082] In order to prepare antibodies against GDF-8, GDF-8 antigen was expressed as a fusion protein in bacteria. 
A portion of murine GDF-8 cDNA spanning amino acids 268-376 (mature region) was inserted into the pRSET vector
(Invitrogen) such that the GDF-8 coding sequence was placed in frame with the initiating methionine codon present in
the vector; the resulting construct created an open reading frame encoding a fusion protein with a molecular weight of
approximately 16,600. The fusion construct was transformed into BL21 (DE3) (pLysS) cells, and expression of the
fusion protein was induced by treatment with isopropylthio-β-galactoside as described (Rosenberg, et al., Gene,
56:125-135). The fusion protein was then purified by metal chelate chromatography according to the instructions provided
by Invitrogen. A Coomassie blue-stained gel of unpurified and purified fusion proteins is shown in FIGURE 8.
[0083] The purified fusion protein was used to immunize both rabbits and chickens. Immunization of rabbits was
carried out by Spring Valley Labs (Sykesville, MD), and immunization of chickens was carried out by HRP, Inc. (Denver,
PA). Western analysis of sera both from immunized rabbits and from immunized chickens demonstrated the presence
of antibodies directed against the fusion protein.

[0084] To express GDF-8 in mammalian cells, the murine GDF-8 cDNA sequence from nucleotides 48-1303 was
cloned in both orientations downstream of the metallothionein I promoter in the pMSXND expression vector; this vector
contains processing signals derived from SV40, a dihydrofolate reductase gene, and a gene conferring resistance to
the antibiotic G418 (Lee and Nathans, J. Biol. Chem., 263:3521-3527). The resulting constructs were transfected into
Chinese hamster ovary cells, and stable transfectants were selected in the presence of G418. Two milliliters of conditioned
media prepared from the G418-resistant cells were dialyzed, lyophilized, electrophoresed under denaturing,
reducing conditions, transferred to nitrocellulose, and incubated with anti-GDF-8 antibodies (described above) and
[125I]iodoprotein A.

[0085] As shown in FIGURE 9, the rabbit GDF-8 antibodies (at a 1:500 dilution) detected a protein of approximately
the predicted molecular weight for the mature C-terminal fragment of GDF-8 in the conditioned media of cells transfected
with a construct in which GDF-8 had been cloned in the correct (sense) orientation with respect to the
metallothionein promoter (lane 2); this band was not detected in a similar sample prepared from cells transfected with a
control antisense construct (lane 1). Similar results were obtained using antibodies prepared in chickens. Hence,
GDF-8 is secreted and proteolytically processed by these transfected mammalian cells.

EXAMPLE 5 

EXPRESSION PATTERN OF GDF-8 

[0086] To determine the expression pattern of GDF-8, 5 µg of twice poly A-selected RNA prepared from a variety of murine
tissue sources were subjected to Northern analysis. As shown in FIGURE 10a (and as shown previously in Example
2), the GDF-8 probe detected a single mRNA species present almost exclusively in skeletal muscle among a large
number of adult tissues surveyed. On longer exposures of the same blot, significantly lower but detectable levels of
GDF-8 mRNA were seen in fat, brain, thymus, heart, and lung. Hence, these results confirm the high degree of specificity
of GDF-8 expression in skeletal muscle. GDF-8 mRNA was also detected in mouse embryos at both gestational
ages (day 12.5 and day 18.5 post-coital) examined but not in placentas at various stages of development (FIGURE 10b).

EXAMPLE 6 

CHROMOSOMAL LOCALIZATION OF GDF-8

[0087] In order to map the chromosomal location of GDF-8, DNA samples from human/rodent somatic cell hybrids
(Drwinga, et al., Genomics, 16:311-413, 1993; Dubois and Naylor, Genomics, 16:315-319, 1993) were analyzed by
polymerase chain reaction followed by Southern blotting. Polymerase chain reaction was carried out using primer #83,
5'-CGCGGATCCGTGGATCTAAATGAGAACAGTGAGC-3' (SEQ ID NO:15) and primer #84,
5'-CGCGAATTCTCAGGTAATGATTGTTTCCGTTGTAGCG-3' (SEQ ID NO:16) for 40 cycles at 94°C for 2 minutes, 60°C for 1 minute, and 72°C
for 2 minutes. These primers correspond to nucleotides 119 to 143 (flanked by a Bam HI recognition sequence), and
nucleotides 394 to 418 (flanked by an Eco RI recognition sequence), respectively, in the human GDF-8 cDNA sequence.
PCR products were electrophoresed on agarose gels, blotted, and probed with oligonucleotide #100,
5'-ACACTAAATCTTCAAGAATA-3' (SEQ ID NO:17), which corresponds to a sequence internal to the region flanked
by primer #83 and #84. Filters were hybridized in 6 X SSC, 1 X Denhardt's solution, 100 ng/ml yeast transfer RNA, and
0.05% sodium pyrophosphate at 50°C.

[0088] As shown in FIGURE 11, the human-specific probe detected a band of the predicted size (approximately 320
base pairs) in the positive control sample (total human genomic DNA) and in a single DNA sample from the
human/rodent hybrid panel. This positive signal corresponds to human chromosome 2. The human chromosome contained
in each of the hybrid cell lines is identified at the top of each of the first 24 lanes (1-22, X, and Y). In the lanes designated
M, CHO, and H, the starting DNA template was total genomic DNA from mouse, hamster, and human sources, respectively.
In the lane marked B1, no template DNA was used. Numbers at left indicate the mobilities of DNA standards.
These data show that the human GDF-8 gene is located on chromosome 2.
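
The predicted size of the PCR product can be estimated from the primer sequences and the cDNA coordinates given above, as in the following sketch. It assumes that the amplified genomic segment contains no intron between the two primer sites and that every primer base beyond the stated cDNA-matching nucleotides is a 5' tail incorporated into the product; the function name is hypothetical.

```python
# Sketch: estimate the expected amplicon size for primers #83 and #84.
PRIMER_83 = "CGCGGATCCGTGGATCTAAATGAGAACAGTGAGC"
PRIMER_84 = "CGCGAATTCTCAGGTAATGATTGTTTCCGTTGTAGCG"


def expected_amplicon_bp(fwd_primer: str, rev_primer: str,
                         fwd_start: int, fwd_end: int,
                         rev_start: int, rev_end: int) -> int:
    """Template span between the outer primer coordinates plus both 5' tails."""
    template_span = rev_end - fwd_start + 1
    fwd_tail = len(fwd_primer) - (fwd_end - fwd_start + 1)
    rev_tail = len(rev_primer) - (rev_end - rev_start + 1)
    return template_span + fwd_tail + rev_tail


size = expected_amplicon_bp(PRIMER_83, PRIMER_84, 119, 143, 394, 418)
print(size)
```

Under these assumptions the estimate is 321 base pairs (a 300 base pair template span plus 5' tails of 9 and 12 nucleotides), consistent with the band of approximately 320 base pairs described above.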

SUMMARY OF SEQUENCES 

[0089] 

SEQ ID NO: 1 is the nucleic acid sequence for clone SJL141.
SEQ ID NO: 2 is the nucleic acid sequence for clone SJL147.
SEQ ID NO: 3 is the nucleic acid sequence for clone ACM13.
SEQ ID NO: 4 is the nucleic acid sequence for clone ACM14.
SEQ ID NO: 5 is the partial nucleotide sequence and deduced amino acid sequence for murine GDF-8.
SEQ ID NO: 6 is the deduced partial amino acid sequence for murine GDF-8.
SEQ ID NO: 7 is the partial nucleotide sequence and deduced amino acid sequence for human GDF-8.
SEQ ID NO: 8 is the deduced partial amino acid sequence for human GDF-8.
SEQ ID NO: 9 is the amino acid sequence for primer SJL141.
SEQ ID NO: 10 is the amino acid sequence for primer SJL147.
SEQ ID NO: 11 is the nucleotide and deduced amino acid sequence for murine GDF-8.
SEQ ID NO: 12 is the deduced amino acid sequence for murine GDF-8.
SEQ ID NO: 13 is the nucleotide and deduced amino acid sequence for human GDF-8.
SEQ ID NO: 14 is the deduced amino acid sequence for human GDF-8.
SEQ ID NO's: 15 and 16 are nucleotide sequences for primer #83 and #84, respectively, which were used to map
human GDF-8 in human/rodent somatic cell hybrids.
SEQ ID NO: 17 is the nucleotide sequence of oligonucleotide #100 which corresponds to a sequence internal to
the region flanked by primer #83 and #84.

SEQUENCE LISTING 

[0090] 

(1) GENERAL INFORMATION: 

(i) APPLICANT: THE JOHNS HOPKINS UNIVERSITY 
(ii) TITLE OF INVENTION: GROWTH DIFFERENTIATION FACTOR-8

(iii) NUMBER OF SEQUENCES: 17 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Spensley Horn Jubas & Lubitz 

(B) STREET: 1880 Century Park East - Suite 500 

(C) CITY: Los Angeles 
(D) STATE: California

(E) COUNTRY: USA 

(F) ZIP: 90067 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT 

(B) FILING DATE: 18-MAR-1994 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Wetherell, Jr., Ph.D., John R., 

(B) REGISTRATION NUMBER: 31,678 

(C) REFERENCE/DOCKET NUMBER: FD-3413 CIP PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (619) 455-5100 

(B) TELEFAX: (619) 455-5110 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: SJL141 

(ix) FEATURE: 

(A) NAME/KEY: modified_base 

(B) LOCATION: 1..35 

(D) OTHER INFORMATION: /mod_base= i 
/note= ""B" is defined as "I" (inosine)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 



CCGGAATTCG GBTGGVANRA YTGGRTBRTB KCBCC
35 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(vii) IMMEDIATE SOURCE:

(B) CLONE: SJL147
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..33

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1..33

(D) OTHER INFORMATION: /mod_base= i
/note= ""B" is defined as "I" (inosine)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

CCGGAATTCR CABSCRCARC TNTCBACBRY CAT
33

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(vii) IMMEDIATE SOURCE:

(B) CLONE: ACM13

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..32

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

CGCGGATCCA GAAGTCAAGG TGACAGACAC AC
32

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: ACM14
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..33 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

CGCGGATCCT CCTCATGAGC ACCCACAGCG GTC 
33 

(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 550 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: mouse GDF-8 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 59..436

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

TTAAGGTAGG AAGGATTTCA GGCTCTATTT ACATAATTGT TCTTTCCTTT TCACACAG
58 

AAT CCC TTT TTA GAA GTC AAG GTG ACA GAC ACA CCC AAG AGG TCC CGG 
106 

Asn Pro Phe Leu Glu Val Lys Val Thr Asp Thr Pro Lys Arg Ser Arg 
1 5 10 15 

AGA GAC TTT GGG CTT GAC TGC GAT GAG CAC TCC ACG GAA TCC CGG TGC
154 

Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser Thr Glu Ser Arg Cys 
20 25 30 

TGC CGC TAC CCC CTC ACG GTC GAT TTT GAA GCC TTT GGA TGG GAC TGG 
202

Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala Phe Gly Trp Asp Trp
35 40 45 

ATT ATC GCA CCC AAA AGA TAT AAG GCC AAT TAC TGC TCA GGA GAG TGT 
250 

Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr Cys Ser Gly Glu Cys
50 55 60 

GAA TTT GTG TTT TTA CAA AAA TAT CCG CAT ACT CAT CTT GTG CAC CAA 
298 

Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr His Leu Val His Gln
65 70 75 80 

GCA AAC CCC AGA GGC TCA GCA GGC CCT TGC TGC ACT CCG ACA AAA ATG 
346 

Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys Thr Pro Thr Lys Met 
85 90 95 

TCT CCC ATT AAT ATG CTA TAT TTT AAT GGC AAA GAA CAA ATA ATA TAT 
394 

Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys Glu Gln Ile Ile Tyr
100 105 110 

GGG AAA ATT CCA GCC ATG GTA GTA GAC CGC TGT GGG TGC TCA 
436 

Gly Lys Ile Pro Ala Met Val Val Asp Arg Cys Gly Cys Ser
115 120 125 

TGAGCTTTGC ATTAGGTTAG AAACTTCCCA AGTCATGGAA GGTCTTCCCC TCAATTTCGA
496 

AACTGTGAAT TCCTGCAGCC CGGGGGATCC ACTAGTTCTA GAGCGGCCGC CACC 
550 
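
For readers who wish to cross-check the deduced amino acid lines against the codon lines of a listing such as SEQ ID NO:5, the following is a minimal Python sketch. It is not part of the original listing. It assumes only the standard genetic code; the names `BASES`, `AMINO`, `CODON_TABLE` and `translate` are illustrative and hypothetical, and the output uses one-letter amino acid codes rather than the three-letter codes printed in the listing.

```python
# Illustrative sketch only (not part of the filed sequence listing).
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string (whitespace ignored) to one-letter codes."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First codon row of the CDS in SEQ ID NO:5 (nucleotides 59 to 106).
row = "AAT CCC TTT TTA GAA GTC AAG GTG ACA GAC ACA CCC AAG AGG TCC CGG"
print(translate(row))  # one-letter form of Asn Pro Phe Leu Glu Val Lys Val ...
```

A row whose translation disagrees with the printed amino acid line is a candidate transcription error in either the codon line or the amino acid line.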

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Asn Pro Phe Leu Glu Val Lys Val Thr Asp Thr Pro Lys Arg Ser Arg
1 5 10 15

Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser Thr Glu Ser Arg Cys
20 25 30

Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala Phe Gly Trp Asp Trp
35 40 45

Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr Cys Ser Gly Glu Cys
50 55 60

Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr His Leu Val His Gln
65 70 75 80

Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys Thr Pro Thr Lys Met
85 90 95

Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys Glu Gln Ile Ile Tyr
100 105 110

Gly Lys Ile Pro Ala Met Val Val Asp Arg Cys Gly Cys Ser
115 120 125

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 326 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: human GDF-8 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..326

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

CA AAA AGA TCC AGA AGG GAT TTT GGT CTT GAC TGT GAT GAG CAC TCA
47

Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser
1 5 10 15

ACA GAA TCA CGA TGC TGT CGT TAC CCT CTA ACT GTG GAT TTT GAA GCT 
95 

Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala 
20 25 30 

TTT GGA TGG GAT TGG ATT ATC GCT CCT AAA AGA TAT AAG GCC AAT TAC 
143 

Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr
35 40 45

TGC TCT GGA GAG TGT GAA TTT GTA TTT TTA CAA AAA TAT CCT CAT ACT 
191 

Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr
50 55 60 

CAT CTG GTA CAC CAA GCA AAC CCC AGA GGT TCA GCA GGC CCT TGC TGT 
239 

His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys
65 70 75 

ACT CCC ACA AAG ATG TCT CCA ATT AAT ATG CTA TAT TTT AAT GGC AAA 
287 

Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys
80 85 90 95 

GAA CAA ATA ATA TAT GGG AAA ATT CCA GCG ATG GTA GTA 
326 

Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val Val
100 105 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser Thr
1 5 10 15

Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala Phe
20 25 30

Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr Cys
35 40 45

Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr His
50 55 60

Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys Thr
65 70 75 80

Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys Glu
85 90 95

Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val Val
100 105



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: SJL141 
(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..9 

(D) OTHER INFORMATION: /note= "His = His, Asn, Lys, Asp or Glu; Asp = Asp or Asn; Val = Val, Ile or
Met; Ala = Ala or Ser."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 



Gly Trp His Asp Trp Val Val Ala Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO:10: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: SJL147 
(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..8 

(D) OTHER INFORMATION: /note= "Ile = Ile, Val, Met, Thr or Ala; Asp = Asp or Glu; Gly = Gly or Ala."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 



Met Ile Val Asp Ser Cys Gly Cys
1 5 



(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2676 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: Murine GDF-8 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 104..1231

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 



GTCTCTCGGA CGGTACATGC ACTAATATTT CACTTGGCAT TACTCAAAAG CAAAAAGAAG
60 

AAATAAGAAC AAGGGAAAAA AAAAGATTGT GCTGATTTTT AAA ATG ATG CAA AAA 
115 

Met Met Gln Lys
1 

CTG CAA ATG TAT GTT TAT ATT TAC CTG TTC ATG CTG ATT GCT GCT GGC 
163 

Leu Gln Met Tyr Val Tyr Ile Tyr Leu Phe Met Leu Ile Ala Ala Gly
5 10 15 20 

CCA GTG GAT CTA AAT GAG GGC AGT GAG AGA GAA GAA AAT GTG GAA AAA 
211 

Pro Val Asp Leu Asn Glu Gly Ser Glu Arg Glu Glu Asn Val Glu Lys 
25 30 35 

GAG GGG CTG TGT AAT GCA TGT GCG TGG AGA CAA AAC ACG AGG TAC TCC 
259 

Glu Gly Leu Cys Asn Ala Cys Ala Trp Arg Gln Asn Thr Arg Tyr Ser
40 45 50 

AGA ATA GAA GCC ATA AAA ATT CAA ATC CTC AGT AAG CTG CGC CTG GAA 
307 

Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys Leu Arg Leu Glu
55 60 65 

ACA GCT CCT AAC ATC AGC AAA GAT GCT ATA AGA CAA CTT CTG CCA AGA 
355 

Thr Ala Pro Asn Ile Ser Lys Asp Ala Ile Arg Gln Leu Leu Pro Arg
70 75 80 

GCG CCT CCA CTC CGG GAA CTG ATC GAT CAG TAC GAC GTC CAG AGG GAT 
403 

Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp Val Gln Arg Asp
85 90 95 100

GAC AGC AGT GAT GGC TCT TTG GAA GAT GAC GAT TAT CAC GCT ACC ACG
451 

Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr His Ala Thr Thr 
105 110 115 

GAA ACA ATC ATT ACC ATG CCT ACA GAG TCT GAC TTT CTA ATG CAA GCG 
499 

Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe Leu Met Gln Ala
120 125 130 

GAT GGC AAG CCC AAA TGT TGC TTT TTT AAA TTT AGC TCT AAA ATA CAG 
547 

Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser Ser Lys Ile Gln
135 140 145 

TAC AAC AAA GTA GTA AAA GCC CAA CTG TGG ATA TAT CTC AGA CCC GTC 
595 

Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr Leu Arg Pro Val
150 155 160 

AAG ACT CCT ACA ACA GTG TTT GTG CAA ATC CTG AGA CTC ATC AAA CCC 
643 

Lys Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg Leu Ile Lys Pro
165 170 175 180 

ATG AAA GAC GGT ACA AGG TAT ACT GGA ATC CGA TCT CTG AAA CTT GAC 
691 

Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser Leu Lys Leu Asp
185 190 195 

ATG AGC CCA GGC ACT GGT ATT TGG CAG AGT ATT GAT GTG AAG ACA GTG 
739 

Met Ser Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp Val Lys Thr Val
200 205 210 

TTG CAA AAT TGG CTC AAA CAG CCT GAA TCC AAC TTA GGC ATT GAA ATC 
787 

Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu Gly Ile Glu Ile
215 220 225 

AAA GCT TTG GAT GAG AAT GGC CAT GAT CTT GCT GTA ACC TTC CCA GGA 
835 

Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val Thr Phe Pro Gly 
230 235 240

CCA GGA GAA GAT GGG CTG AAT CCC TTT TTA GAA GTC AAG GTG ACA GAC
883 

Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val Lys Val Thr Asp 
245 250 255 260 

ACA CCC AAG AGG TCC CGG AGA GAC TTT GGG CTT GAC TGC GAT GAG CAC 
931 

Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys Asp Glu His 
265 270 275 

TCC ACG GAA TCC CGG TGC TGC CGC TAC CCC CTC ACG GTC GAT TTT GAA 
979 

Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu 
280 285 290 

GCC TTT GGA TGG GAC TGG ATT ATC GCA CCC AAA AGA TAT AAG GCC AAT 
1027 

Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn
295 300 305 

TAC TGC TCA GGA GAG TGT GAA TTT GTG TTT TTA CAA AAA TAT CCG CAT 
1075 

Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys Tyr Pro His
310 315 320 

ACT CAT CTT GTG CAC CAA GCA AAC CCC AGA GGC TCA GCA GGC CCT TGC 
1123 

Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys
325 330 335 340 

TGC ACT CCG ACA AAA ATG TCT CCC ATT AAT ATG CTA TAT TTT AAT GGC 
1171 

Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly
345 350 355 

AAA GAA CAA ATA ATA TAT GGG AAA ATT CCA GCC ATG GTA GTA GAC CGC 
1219 

Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val Val Asp Arg
360 365 370 

TGT GGG TGC TCA TGAGCTTTGC ATTAGGTTAG AAACTTCCCA AGTCATGGAA 
1271 

Cys Gly Cys Ser 
375

GGTCTTCCCC TCAATTTCGA AACTGTGAAT TCAAGCACCA CACGCTGTAG GCCTTGAGTA
1331 

TGCTCTAGTA ACGTAAGCAC AAGCTACAGT GTATGAACTA AAAGAGAGAA TAGATGCAAT 
1391 

GGTTGGCATT CAACCACCAA AATAAACCAT ACTATAGGAT GTTGTATGAT TTCCACAGTT
1451

TTTGAAATAG ATGGAGATCA AATTACATTT ATGTCCATAT ATCTATATTA CAACTACAAT 
1511 

CTAGGCAAGG AAGTGAGAGC ACATCTTGTG GTCTGCTGAG TTAGGAGGGT ATGATTAAAA 
1571 

GGTAAAGTCT TATTTCCTAA CAGTTTCACT TAATATTTAC AGAAGAATCT ATATGTAGCC 
1631 

TTTGTAAAGT GTAGGATTGT TATCATTTAA AAACATCATG TACACTTATA TTTGTATTGT 
1691 

ATACTTGGTA AGATAAAATT CCACAAAGTA GGAATGGGGC CTCACATACA CATTGCCATT 
1751 

CCTATTATAA TTGGACAATC CACCACGGTG CTAATGCAGT GCTGAATGGC TCCTACTGGA 
1811 

CCTCTCGATA CAACACTCTA CAAAGTACGA GTCTCTCTCT CCCTTCCAGG TGCATCTCCA 
1871 

CACACACAGC ACTAAGTGTT CAATGCATTT TCTTTAAGGA AAGAAGAATC TTTTTTTCTA 
1931 

GAGGTCAACT TTCAGTCAAC TCTAGCACAG CGGGAGTGAC TGCTGCATCT TAAAAGGCAG 
1991 

CCAAACAGTA TTCATTTTTT AATCTAAATT TCAAAATCAC TGTCTGCCTT TATCACATGG 
2051 

CAATTTTGTG GTAAAATAAT GGAAATGACT GGTTCTATCA ATATTGTATA AAAGACTCTG 
2111 

AAACAATTAC ATTTATATAA TATGTATACA ATATTGTTTT GTAAATAAGT GTCTCCTTTT 
2171

ATATTTACTT TGGTATATTT TTACACTAAT GAAATTTCAA ATCATTAAAG TACAAAGACA
2231 

TGTCATGTAT CACAAAAAAG GTGACTGCTT CTATTTCAGA GTGAATTAGC AGATTCAATA 
2291 

GTGGTCTTAA AACTCTGTAT GTTAAGATTA GAAGGTTATA TTACAATCAA TTTATGTATT 
2351 

TTTTACATTA TCAACTTATG GTTTCATGGT GGCTGTATCT ATGAATGTGG CTCCCAGTCA 
2411 

AATTTCAATG CCCCACCATT TTAAAAATTA CAAGCATTAC TAAACATACC AACATGTATC 
2471 

TAAAGAAATA CAAATATGGT ATCTCAATAA CAGCTACTTT TTTATTTTAT AATTTGACAA
2531 

TGAATACATT TCTTTTATTT ACTTCAGTTT TATAAATTGG AACTTTGTTT ATCAAATGTA 
2591 

TTGTACTCAT AGCTAAATGA AATTATTTCT TACATAAAAA TGTGTAGAAA CTATAAATTA 
2651 

AAGTGTTTTC ACATTTTTGA AAGGC 
2676 



(2) INFORMATION FOR SEQ ID NO:12: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:



Met Met Gln Lys Leu Gln Met Tyr Val Tyr Ile Tyr Leu Phe Met Leu
1 5 10 15

Ile Ala Ala Gly Pro Val Asp Leu Asn Glu Gly Ser Glu Arg Glu Glu
20 25 30

Asn Val Glu Lys Glu Gly Leu Cys Asn Ala Cys Ala Trp Arg Gln Asn
35 40 45

Thr Arg Tyr Ser Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys
50 55 60

Leu Arg Leu Glu Thr Ala Pro Asn Ile Ser Lys Asp Ala Ile Arg Gln
65 70 75 80

Leu Leu Pro Arg Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp
85 90 95

Val Gln Arg Asp Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr
100 105 110

His Ala Thr Thr Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe
115 120 125

Leu Met Gln Ala Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser
130 135 140

Ser Lys Ile Gln Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr
145 150 155 160

Leu Arg Pro Val Lys Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg
165 170 175

Leu Ile Lys Pro Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser
180 185 190

Leu Lys Leu Asp Met Ser Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp
195 200 205

Val Lys Thr Val Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu
210 215 220

Gly Ile Glu Ile Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val
225 230 235 240

Thr Phe Pro Gly Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val
245 250 255

Lys Val Thr Asp Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp
260 265 270

Cys Asp Glu His Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr
275 280 285

Val Asp Phe Glu Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg
290 295 300

Tyr Lys Ala Asn Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln
305 310 315 320

Lys Tyr Pro His Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser
325 330 335

Ala Gly Pro Cys Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu
340 345 350

Tyr Phe Asn Gly Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met
355 360 365

Val Val Asp Arg Cys Gly Cys Ser
370 375
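
The CDS coordinates and the stated protein lengths in this listing can be checked against one another by simple arithmetic: a CDS given by 1-based inclusive positions start..end spans end - start + 1 nucleotides, and one third of that is the number of codons. The Python sketch below illustrates this check. It is not part of the original listing; the function name `codon_count` and the `entries` table are hypothetical helpers, and the coordinates and lengths are those stated for SEQ ID NOs: 5 to 8 and 11 to 14. In this listing the stated CDS ranges end at the last sense codon, so the TGA stop codon is not counted.

```python
# Illustrative sketch only (not part of the filed sequence listing).
def codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return length // 3


# (CDS start, CDS end, stated protein length) as given in the listing.
entries = {
    "SEQ ID NO:5 / SEQ ID NO:6": (59, 436, 126),
    "SEQ ID NO:7 / SEQ ID NO:8": (3, 326, 108),
    "SEQ ID NO:11 / SEQ ID NO:12": (104, 1231, 376),
    "SEQ ID NO:13 / SEQ ID NO:14": (59, 1183, 375),
}

for name, (start, end, stated) in entries.items():
    n = codon_count(start, end)
    print(f"{name}: {n} codons, stated length {stated}, match={n == stated}")
```

All four pairs agree, which is a quick sanity check that the CDS locations and the amino acid lengths were transcribed consistently.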

(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2743 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: Human GDF-8 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 59..1183

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

AACAAAAGTA AAAGGAAGAA ACAAGAACAA GAAAAAAGAT TATATTGATT TTAAAATC 
58

ATG CAA AAA CTG CAA CTC TGT GTT TAT ATT TAC CTG TTT ATG CTG ATT
106

Met Gln Lys Leu Gln Leu Cys Val Tyr Ile Tyr Leu Phe Met Leu Ile
1 5 10 15

GTT GCT GGT CCA GTG GAT CTA AAT GAG AAC AGT GAG CAA AAA GAA AAT
154

Val Ala Gly Pro Val Asp Leu Asn Glu Asn Ser Glu Gln Lys Glu Asn
20 25 30 

GTG GAA AAA GAG GGG CTG TGT AAT GCA TGT ACT TGG AGA CAA AAC ACT 
202 

Val Glu Lys Glu Gly Leu Cys Asn Ala Cys Thr Trp Arg Gln Asn Thr
35 40 45 

AAA TCT TCA AGA ATA GAA GCC ATT AAG ATA CAA ATC CTC AGT AAA CTT 
250 

Lys Ser Ser Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys Leu
50 55 60 

CGT CTG GAA ACA GCT CCT AAC ATC AGC AAA GAT GTT ATA AGA CAA CTT 
298 

Arg Leu Glu Thr Ala Pro Asn Ile Ser Lys Asp Val Ile Arg Gln Leu
65 70 75 80 

TTA CCC AAA GCT CCT CCA CTC CGG GAA CTG ATT GAT CAG TAT GAT GTC 
346 

Leu Pro Lys Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp Val
85 90 95 

CAG AGG GAT GAC AGC AGC GAT GGC TCT TTG GAA GAT GAC GAT TAT CAC 
394 

Gln Arg Asp Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr His
100 105 110 

GCT ACA ACG GAA ACA ATC ATT ACC ATG CCT ACA GAG TCT GAT TTT CTA 
442 

Ala Thr Thr Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe Leu
115 120 125 

ATG CAA GTG GAT GGA AAA CCC AAA TGT TGC TTC TTT AAA TTT AGC TCT 
490 

Met Gln Val Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser Ser
130 135 140

AAA ATA CAA TAC AAT AAA GTA GTA AAG GCC CAA CTA TGG ATA TAT TTG
538

Lys Ile Gln Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr Leu
145 150 155 160 

AGA CCC GTC GAG ACT CCT ACA ACA GTG TTT GTG CAA ATC CTG AGA CTC 
586 

Arg Pro Val Glu Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg Leu
165 170 175 

ATC AAA CCT ATG AAA GAC GGT ACA AGG TAT ACT GGA ATC CGA TCT CTG 
634 

Ile Lys Pro Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser Leu
180 185 190 

AAA CTT GAC ATG AAC CCA GGC ACT GGT ATT TGG CAG AGC ATT GAT GTG 
682 

Lys Leu Asp Met Asn Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp Val
195 200 205 

AAG ACA GTG TTG CAA AAT TGG CTC AAA CAA CCT GAA TCC AAC TTA GGC 
730 

Lys Thr Val Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu Gly
210 215 220 

ATT GAA ATA AAA GCT TTA GAT GAG AAT GGT CAT GAT CTT GCT GTA ACC 
778 

Ile Glu Ile Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val Thr
225 230 235 240 

TTC CCA GGA CCA GGA GAA GAT GGG CTG AAT CCG TTT TTA GAG GTC AAG 
826 

Phe Pro Gly Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val Lys 
245 250 255 

GTA ACA GAC ACA CCA AAA AGA TCC AGA AGG GAT TTT GGT CTT GAC TGT 
874 

Val Thr Asp Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys 
260 265 270 

GAT GAG CAC TCA ACA GAA TCA CGA TGC TGT CGT TAC CCT CTA ACT GTG 
922 

Asp Glu His Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val 
275 280 285

GAT TTT GAA GCT TTT GGA TGG GAT TGG ATT ATC GCT CCT AAA AGA TAT
970

Asp Phe Glu Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr
290 295 300 

AAG GCC AAT TAC TGC TCT GGA GAG TGT GAA TTT GTA TTT TTA CAA AAA 
1018 

Lys Ala Asn Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys
305 310 315 320 

TAT CCT CAT ACT CAT CTG GTA CAC CAA GCA AAC CCC AGA GGT TCA GCA 
1066 

Tyr Pro His Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala
325 330 335 

GGC CCT TGC TGT ACT CCC ACA AAG ATG TCT CCA ATT AAT ATG CTA TAT 
1114 

Gly Pro Cys Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr
340 345 350 

TTT AAT GGC AAA GAA CAA ATA ATA TAT GGG AAA ATT CCA GCG ATG GTA 
1162 

Phe Asn Gly Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val
355 360 365 

GTA GAC CGC TGT GGG TGC TCA TGAGATTTAT ATTAAGCGTT CATAACTTCC 
1213 

Val Asp Arg Cys Gly Cys Ser 
370 375 

TAAAACATGG AAGGTTTTCC CCTCAACAAT TTTGAAGCTG TGAAATTAAG TACCACAGGC 
1273 



TATAGGCCTA GAGTATGCTA CAGTCACTTA AGCATAAGCT ACAGTATCTA AACTAAAAGG 
1333 



GGGAATATAT GCAATGGTTG GCATTTAACC ATCCAAACAA ATCATACAAG AAAGTTTTAT 
1393 



GATTTCCAGA GTTTTTGAGC TAGAAGGAGA TCAAATTACA TTTATGTTCC TATATATTAC 
1453 



AACATCGGCG AGGAAATGAA AGCGATTCTC CTTGAGTTCT GATGAATTAA AGGAGTATGC 
1513

TTTAAAGTCT ATTTCTTTAA AGTTTTGTTT AATATTTACA GAAAAATCCA CATACAGTAT
1573

TGGTAAAATG CAGGATTGTT ATATACCATC ATTCGAATCA TCCTTAAACA CTTGAATTTA
1633

TATTGTATGG TAGTATACTT GGTAAGATAA AATTCCACAA AAATAGGGAT GGTGCAGCAT
1693

ATGCAATTTC CATTCCTATT ATAATTGACA CAGTACATTA ACAATCCATG CCAACGGTGC
1753

TAATACGATA GGCTGAATGT CTGAGGCTAC CAGGTTTATC ACATAAAAAA CATTCAGTAA
1813

AATAGTAAGT TTCTCTTTTC TTCAGGTGCA TTTTCCTACA CCTCCAAATG AGGAATGGAT
1873

TTTCTTTAAT GTAAGAAGAA TCATTTTTCT AGAGGTTGGC TTTCAATTCT GTAGCATACT
1933

TGGAGAAACT GCATTATCTT AAAAGGCAGT CAAATGGTGT TTGTTTTTAT CAAAATGTCA
1993

AAATAACATA CTTGGAGAAG TATGTAATTT TGTCTTTGGA AAATTACAAC ACTGCCTTTG
2053

CAACACTGCA GTTTTTATGG TAAAATAATA GAAATGATCG ACTCTATCAA TATTGTATAA
2113

AAAGACTGAA ACAATGCATT TATATAATAT GTATACAATA TTGTTTTGTA AATAAGTGTC
2173

TCCTTTTTTA TTTACTTTGG TATATTTTTA CACTAAGGAC ATTTCAAATT AAGTACTAAG
2233

GCACAAAGAC ATGTCATGCA TCACAGAAAA GCAACTACTT ATATTTCAGA GCAAATTAGC
2293

AGATTAAATA GTGGTCTTAA AACTCCATAT GTTAATGATT AGATGGTTAT ATTACAATCA
2353

TTTTATATTT TTTTACATGA TTAACATTCA CTTATGGATT CATGATGGCT GTATAAAGTG
2413

AATTTGAAAT TTCAATGGTT TACTGTCATT GTGTTTAAAT CTCAACGTTC CATTATTTTA
2473 

ATACTTGCAA AAACATTACT AAGTATACCA AAATAATTGA CTCTATTATC TGAAATGAAG 
2533 



AATAAACTGA TGCTATCTCA ACAATAACTG TTACTTTTAT TTTATAATTT GATAATGAAT 
2593 



ATATTTCTGC ^TTTATTTAC TTCTGTTTTG TAAATTGGGA TTTTGTTAAT CAAATTTATT 
2653 



GTACTATGAC TAAATGAAAT TATTTCTTAC ATCTAATTTG TAGAAACAGT ATAAGTTATA 
2713 

TTAAAGTGTT TTCACATTTT TTTGAAAGAC 
2743 



(2) INFORMATION FOR SEQ ID NO:14: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 



Met Gln Lys Leu Gln Leu Cys Val Tyr Ile Tyr Leu Phe Met Leu Ile
1 5 10 15

Val Ala Gly Pro Val Asp Leu Asn Glu Asn Ser Glu Gln Lys Glu Asn
20 25 30

Val Glu Lys Glu Gly Leu Cys Asn Ala Cys Thr Trp Arg Gln Asn Thr
35 40 45

Lys Ser Ser Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys Leu
50 55 60

Arg Leu Glu Thr Ala Pro Asn Ile Ser Lys Asp Val Ile Arg Gln Leu
65 70 75 80

Leu Pro Lys Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp Val
85 90 95

Gln Arg Asp Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr His
100 105 110

Ala Thr Thr Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe Leu
115 120 125

Met Gln Val Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser Ser
130 135 140

Lys Ile Gln Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr Leu
145 150 155 160

Arg Pro Val Glu Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg Leu
165 170 175

Ile Lys Pro Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser Leu
180 185 190

Lys Leu Asp Met Asn Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp Val
195 200 205

Lys Thr Val Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu Gly
210 215 220

Ile Glu Ile Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val Thr
225 230 235 240

Phe Pro Gly Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val Lys
245 250 255

Val Thr Asp Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys
260 265 270

Asp Glu His Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val
275 280 285

Asp Phe Glu Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr
290 295 300

Lys Ala Asn Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys
305 310 315 320

Tyr Pro His Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala
325 330 335

Gly Pro Cys Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr
340 345 350

Phe Asn Gly Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val
355 360 365

Val Asp Arg Cys Gly Cys Ser
370 375

(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: #83 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..34 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

CGCGGATCCG TGGATCTAAA TGAGAACAGT GAGC 
34 

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: #84 
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..37 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 



CGCGAATTCT CAGGTAATCA TTGTTTCCGT TGTAGCG 
37 

(2) INFORMATION FOR SEQ ID NO:17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: #100 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 



ACACTAAATC TTCAAGAATA 
20 



ANNEX 

SEQUENCE LISTING 
[0091] 

(1) GENERAL INFORMATION 

(i) APPLICANT: Johns Hopkins University School of Medicine, 720 Rutland Avenue, Baltimore, Maryland 21205,
United States of America

(ii) TITLE OF THE INVENTION: GROWTH DIFFERENTIATION FACTOR-8 

(iii) NUMBER OF SEQUENCES: 32 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: Windows95 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(v) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/525,596 

(B) FILING DATE: 19-SEP-1995 

(C) CLASSIFICATION:

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: PCT/US94/07762 

(B) FILING DATE: 08-JUL-1994 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: SJL141

(ix) FEATURE: 

(A) NAME/KEY: Modified Base 

(B) LOCATION: 1...35

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

CCGGAATTCG GBTGGVANRA YTGGRTBRTB KCBCC 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: SJL147 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1...33 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
CCGGAATTCR CABSCRCARC TNTCBACBRY CAT 

(2) INFORMATION FOR SEQ ID NO:3: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ACM13
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1...32

(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CGCGGATCCA GAAGTCAAGG TGACAGACAC AC 32

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: ACM14

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1...33

(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

CGCGGATCCT CCTCATGAGC ACCCACAGCG GTC 33

(2) INFORMATION FOR SEQ ID NO:5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 550 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: mouse GDF-8
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 59...436
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:



TTAAGGTAGG AAGGATTTCA GGCTCTATTT ACATAATTGT TCTTTCCTTT TCACACAG 

AAT CCC TTT TTA GAA GTC AAG GTG ACA GAC ACA CCC AAG AGG TCC CGG 
Asn Pro Phe Leu Glu Val Lys Val Thr Asp Thr Pro Lys Arg Ser Arg 
1 5 10 15 

AGA GAC TTT GGG CTT GAC TGC GAT GAG CAC TCC ACG GAA TCC CGG TGC 
Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser Thr Glu Ser Arg Cys 
20 25 30 

TGC CGC TAC CCC CTC ACG GTC GAT TTT GAA GCC TTT GGA TGG GAC TGG 
Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala Phe Gly Trp Asp Trp 
35 40 45 

ATT ATC GCA CCC AAA AGA TAT AAG GCC AAT TAC TGC TCA GGA GAG TGT 
Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr Cys Ser Gly Glu Cys
50 55 60 

GAA TTT GTG TTT TTA CAA AAA TAT CCG CAT ACT CAT CTT GTG CAC CAA 
Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr His Leu Val His Gln
65 70 75 80 

GCA AAC CCC AGA GGC TCA GCA GGC CCT TGC TGC ACT CCG ACA AAA ATG 
Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys Thr Pro Thr Lys Met 
85 90 95

TCT CCC ATT AAT ATG CTA TAT TTT AAT GGC AAA GAA CAA ATA ATA TAT 
Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys Glu Gln Ile Ile Tyr
100 105 110

GGG AAA ATT CCA GCC ATG GTA GTA GAC CGC TGT GGG TGC TCA TGAGCTTTGC 
Gly Lys Ile Pro Ala Met Val Val Asp Arg Cys Gly Cys Ser
115 120 125

ATTAGGTTAG AAACTTCCCA AGTCATGGAA GGTCTTCCCC TCAATTTCGA AACTGTGAAT 
TCCTGCAGCC CGGGGGATCC ACTAGTTCTA GAGCGGCCGC CACC 



(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Asn Pro Phe Leu Glu Val Lys Val Thr Asp Thr Pro Lys Arg Ser Arg
1 5 10 15

Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser Thr Glu Ser Arg Cys
20 25 30

Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala Phe Gly Trp Asp Trp
35 40 45

Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr Cys Ser Gly Glu Cys
50 55 60

Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr His Leu Val His Gln
65 70 75 80

Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys Thr Pro Thr Lys Met
85 90 95

Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys Glu Gln Ile Ile Tyr
100 105 110

Gly Lys Ile Pro Ala Met Val Val Asp Arg Cys Gly Cys Ser
115 120 125

(2) INFORMATION FOR SEQ ID NO:7: 
(i) SEQUENCE CHARACTERISTICS: 

25 (A) LENGTH: 326 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

30 (vii) IMMEDIATE SOURCE: 

(B) CLONE: human GDF-8 
(ix) FEATURE: 

35 

(A) NAME/KEY: CDS 

(B) LOCATION: 3...326 

(D) OTHER INFORMATION: 

40 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 



CA AAA AGA TCC AGA AGG GAT TTT GGT CTT GAC TGT GAT GAG CAC TCA 47
Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser
1 5 10 15

ACA GAA TCA CGA TGC TGT CGT TAC CCT CTA ACT GTG GAT TTT GAA GCT 95
Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala
20 25 30

TTT GGA TGG GAT TGG ATT ATC GCT CCT AAA AGA TAT AAG GCC AAT TAC 143
Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr
35 40 45

TGC TCT GGA GAG TGT GAA TTT GTA TTT TTA CAA AAA TAT CCT CAT ACT 191
Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr
50 55 60

CAT CTG GTA CAC CAA GCA AAC CCC AGA GGT TCA GCA GGC CCT TGC TGT 239
His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys
65 70 75

ACT CCC ACA AAG ATG TCT CCA ATT AAT ATG CTA TAT TTT AAT GGC AAA 287
Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys
80 85 90 95

GAA CAA ATA ATA TAT GGG AAA ATT CCA GCG ATG GTA GTA 326
Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val Val
100 105
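
The relationship between the nucleotide rows and the amino acid rows of SEQ ID NO:7 can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and the coding region starting at nucleotide 3 as stated in the FEATURE block; the names `CODON_TABLE`, `SEQ7` and `translate` are illustrative and do not appear in the listing.

```python
# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# SEQ ID NO:7 as printed in the listing; whitespace is removed below.
SEQ7 = "".join("""
CA AAA AGA TCC AGA AGG GAT TTT GGT CTT GAC TGT GAT GAG CAC TCA
ACA GAA TCA CGA TGC TGT CGT TAC CCT CTA ACT GTG GAT TTT GAA GCT
TTT GGA TGG GAT TGG ATT ATC GCT CCT AAA AGA TAT AAG GCC AAT TAC
TGC TCT GGA GAG TGT GAA TTT GTA TTT TTA CAA AAA TAT CCT CAT ACT
CAT CTG GTA CAC CAA GCA AAC CCC AGA GGT TCA GCA GGC CCT TGC TGT
ACT CCC ACA AAG ATG TCT CCA ATT AAT ATG CTA TAT TTT AAT GGC AAA
GAA CAA ATA ATA TAT GGG AAA ATT CCA GCG ATG GTA GTA
""".split())


def translate(dna: str) -> str:
    """Translate complete codons of `dna` into one-letter amino acid codes."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


CDS_START = 3  # 1-based, from "(B) LOCATION: 3...326"
protein = translate(SEQ7[CDS_START - 1:])

print(len(SEQ7), len(protein))
print(protein)
```

Under these assumptions the translated length agrees with the 108 amino acids stated for SEQ ID NO:8.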



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 



Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys Asp Glu His Ser Thr
1 5 10 15

Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val Asp Phe Glu Ala Phe
20 25 30

Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr Lys Ala Asn Tyr Cys
35 40 45

Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys Tyr Pro His Thr His
50 55 60

Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala Gly Pro Cys Cys Thr
65 70 75 80

Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr Phe Asn Gly Lys Glu
85 90 95

Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val Val
100 105



(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
(B) CLONE: SJL141
(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1...9

(D) OTHER INFORMATION: /note= "Xaa at position 3=His, Gln, Asn, Lys, Asp or Glu; Xaa at position 4=Asp or Asn; Xaa at positions 6 and 7=Val, Ile or Met; Xaa at position 8=Ala or Ser"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:



Gly Trp Xaa Xaa Trp Xaa Xaa Xaa Pro

1 5 
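
The Xaa definitions of SEQ ID NO:9 describe a family of peptides rather than a single one. The sketch below is one way to express that note as a regular expression and to count the peptides it covers; it assumes one-letter amino acid codes, and the two test fragments are transcribed from SEQ ID NO:19 (BMP-2, residues 30-40) and SEQ ID NO:8 (GDF-8, residues 32-41). The names `ALLOWED` and `PATTERN` are illustrative only.

```python
import re
from math import prod

# Allowed residues at each of the nine positions of SEQ ID NO:9.
ALLOWED = ["G", "W", "HQNKDE", "DN", "W", "VIM", "VIM", "AS", "P"]
PATTERN = re.compile("".join(f"[{choices}]" for choices in ALLOWED))

# Number of distinct peptides covered by the degenerate definition.
n_variants = prod(len(choices) for choices in ALLOWED)

bmp2_fragment = "VGWNDWIVAPP"  # SEQ ID NO:19, residues 30-40
gdf8_fragment = "FGWDWIIAPK"   # SEQ ID NO:8, residues 32-41

bmp2_hit = PATTERN.search(bmp2_fragment)
gdf8_hit = PATTERN.search(gdf8_fragment)

print(n_variants)
print(bmp2_hit.group() if bmp2_hit else None)
print(gdf8_hit.group() if gdf8_hit else None)
```

As transcribed, the BMP-2 fragment fits the nine-residue pattern, while the corresponding GDF-8 stretch has only one residue between the two Trp residues and therefore does not match at the peptide level.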

(2) INFORMATION FOR SEQ ID NO:10: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: SJL147

(ix) FEATURE:

(A) NAME/KEY: Peptide
(B) LOCATION: 1...8

(D) OTHER INFORMATION: /note= "Xaa at position 2=Ile, Val, Met, Thr or Ala; Xaa at position 4=Asp or Glu; Xaa at position 7=Gly or Ala"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:


Met Xaa Val Xaa Ser Cys Xaa Cys 
1 5 



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2676 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA
(vii) IMMEDIATE SOURCE:
(B) CLONE: Murine GDF-8
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 104...1231
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:



GTCTCTCGGA CGGTACATGC ACTAATATTT CACTTGGCAT TACTCAAAAG CAAAAAGAAG 60
AAATAAGAAC AAGGGAAAAA AAAAGATTGT GCTGATTTTT AAA ATG ATG CAA AAA 115
Met Met Gln Lys
1

CTG CAA ATG TAT GTT TAT ATT TAC CTG TTC ATG CTG ATT GCT GCT GGC 163
Leu Gln Met Tyr Val Tyr Ile Tyr Leu Phe Met Leu Ile Ala Ala Gly
5 10 15 20

CCA GTG GAT CTA AAT GAG GGC AGT GAG AGA GAA GAA AAT GTG GAA AAA 211
Pro Val Asp Leu Asn Glu Gly Ser Glu Arg Glu Glu Asn Val Glu Lys
25 30 35

GAG GGG CTG TGT AAT GCA TGT GCG TGG AGA CAA AAC ACG AGG TAC TCC 259
Glu Gly Leu Cys Asn Ala Cys Ala Trp Arg Gln Asn Thr Arg Tyr Ser
40 45 50

AGA ATA GAA GCC ATA AAA ATT CAA ATC CTC AGT AAG CTG CGC CTG GAA 307
Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys Leu Arg Leu Glu
55 60 65

ACA GCT CCT AAC ATC AGC AAA GAT GCT ATA AGA CAA CTT CTG CCA AGA 355
Thr Ala Pro Asn Ile Ser Lys Asp Ala Ile Arg Gln Leu Leu Pro Arg
70 75 80

GCG CCT CCA CTC CGG GAA CTG ATC GAT CAG TAC GAC GTC CAG AGG GAT 403
Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp Val Gln Arg Asp
85 90 95 100

GAC AGC AGT GAT GGC TCT TTG GAA GAT GAC GAT TAT CAC GCT ACC ACG 451
Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr His Ala Thr Thr
105 110 115

GAA ACA ATC ATT ACC ATG CCT ACA GAG TCT GAC TTT CTA ATG CAA GCG 499
Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe Leu Met Gln Ala
120 125 130

GAT GGC AAG CCC AAA TGT TGC TTT TTT AAA TTT AGC TCT AAA ATA CAG 547
Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser Ser Lys Ile Gln
135 140 145

TAC AAC AAA GTA GTA AAA GCC CAA CTG TGG ATA TAT CTC AGA CCC GTC 595
Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr Leu Arg Pro Val
150 155 160

AAG ACT CCT ACA ACA GTG TTT GTG CAA ATC CTG AGA CTC ATC AAA CCC 643
Lys Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg Leu Ile Lys Pro
165 170 175 180

ATG AAA GAC GGT ACA AGG TAT ACT GGA ATC CGA TCT CTG AAA CTT GAC 691
Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser Leu Lys Leu Asp
185 190 195

ATG AGC CCA GGC ACT GGT ATT TGG CAG AGT ATT GAT GTG AAG ACA GTG 739
Met Ser Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp Val Lys Thr Val
200 205 210

TACTTTGGTA TATTTTTACA CTAATGAAAT TTCAAATCAT TAAAGTACAA AGACATGTCA 2236

TGTATCACAA AAAAGGTGAC TGCTTCTATT TCAGAGTGAA TTAGCAGATT CAATAGTGGT 2296

CTTAAAACTC TGTATGTTAA GATTAGAAGG TTATATTACA ATCAATTTAT GTATTTTTTA 2356

CATTATCAAC TTATGGTTTC ATGGTGGCTG TATCTATGAA TGTGGCTCCC AGTCAAATTT 2416

CAATGCCCCA CCATTTTAAA AATTACAAGC ATTACTAAAC ATACCAACAT GTATCTAAAG 2476

AAATACAAAT ATGGTATCTC AATAACAGCT ACTTTTTTAT TTTATAATTT GACAATGAAT 2536

ACATTTCTTT TATTTACTTC AGTTTTATAA ATTGGAACTT TGTTTATCAA ATGTATTGTA 2596

CTCATAGCTA AATGAAATTA TTTCTTACAT AAAAATGTGT AGAAACTATA AATTAAAGTG 2656

TTTTCACATT TTTGAAAGGC 2676

(2) INFORMATION FOR SEQ ID NO:12: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Met Gln Lys Leu Gln Met Tyr Val Tyr Ile Tyr Leu Phe Met Leu
1 5 10 15

Ile Ala Ala Gly Pro Val Asp Leu Asn Glu Gly Ser Glu Arg Glu Glu
20 25 30

Asn Val Glu Lys Glu Gly Leu Cys Asn Ala Cys Ala Trp Arg Gln Asn
35 40 45

Thr Arg Tyr Ser Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys
50 55 60

Leu Arg Leu Glu Thr Ala Pro Asn Ile Ser Lys Asp Ala Ile Arg Gln
65 70 75 80

Leu Leu Pro Arg Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp
85 90 95

Val Gln Arg Asp Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr
100 105 110

His Ala Thr Thr Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe
115 120 125

Leu Met Gln Ala Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser
130 135 140

Ser Lys Ile Gln Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr
145 150 155 160

Leu Arg Pro Val Lys Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg
165 170 175

Leu Ile Lys Pro Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser
180 185 190

Leu Lys Leu Asp Met Ser Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp
195 200 205

Val Lys Thr Val Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu
210 215 220

Gly Ile Glu Ile Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val
225 230 235 240

Thr Phe Pro Gly Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val
245 250 255

Lys Val Thr Asp Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp
260 265 270

Cys Asp Glu His Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr
275 280 285

Val Asp Phe Glu Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg
290 295 300

Tyr Lys Ala Asn Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln
305 310 315 320

Lys Tyr Pro His Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser
325 330 335

Ala Gly Pro Cys Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu
340 345 350

Tyr Phe Asn Gly Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met
355 360 365

Val Val Asp Arg Cys Gly Cys Ser
370 375

(2) INFORMATION FOR SEQ ID NO:13: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2743 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Genomic DNA

(vii) IMMEDIATE SOURCE:
(B) CLONE: Human GDF-8
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 59...1183
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:



AAGAAAAGTA AAAGGAAGAA ACAAGAACAA GAAAAAAGAT TATATTGATT TTAAAATC 58

ATG CAA AAA CTG CAA CTC TGT GTT TAT ATT TAC CTG TTT ATG CTG ATT 106
Met Gln Lys Leu Gln Leu Cys Val Tyr Ile Tyr Leu Phe Met Leu Ile
1 5 10 15

GTT GCT GGT CCA GTG GAT CTA AAT GAG AAC AGT GAG CAA AAA GAA AAT 154
Val Ala Gly Pro Val Asp Leu Asn Glu Asn Ser Glu Gln Lys Glu Asn
20 25 30

GTG GAA AAA GAG GGG CTG TGT AAT GCA TGT ACT TGG AGA CAA AAC ACT 202
Val Glu Lys Glu Gly Leu Cys Asn Ala Cys Thr Trp Arg Gln Asn Thr
35 40 45

AAA TCT TCA AGA ATA GAA GCC ATT AAG ATA CAA ATC CTC AGT AAA CTT 250
Lys Ser Ser Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys Leu
50 55 60

CGT CTG GAA ACA GCT CCT AAC ATC AGC AAA GAT GTT ATA AGA CAA CTT 298
Arg Leu Glu Thr Ala Pro Asn Ile Ser Lys Asp Val Ile Arg Gln Leu
65 70 75 80

TTA CCC AAA GCT CCT CCA CTC CGG GAA CTG ATT GAT CAG TAT GAT GTC 346
Leu Pro Lys Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp Val
85 90 95



CAG AGG GAT GAC AGC AGC GAT GGC TCT TTG GAA GAT GAC GAT TAT CAC 394
Gln Arg Asp Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr His
100 105 110

GCT ACA ACG GAA ACA ATC ATT ACC ATG CCT ACA GAG TCT GAT TTT CTA 442
Ala Thr Thr Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe Leu
115 120 125

ATG CAA GTG GAT GGA AAA CCC AAA TGT TGC TTC TTT AAA TTT AGC TCT 490
Met Gln Val Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser Ser
130 135 140

AAA ATA CAA TAC AAT AAA GTA GTA AAG GCC CAA CTA TGG ATA TAT TTG 538
Lys Ile Gln Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr Leu
145 150 155 160

AGA CCC GTC GAG ACT CCT ACA ACA GTG TTT GTG CAA ATC CTG AGA CTC 586
Arg Pro Val Glu Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg Leu
165 170 175

ATC AAA CCT ATG AAA GAC GGT ACA AGG TAT ACT GGA ATC CGA TCT CTG 634
Ile Lys Pro Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser Leu
180 185 190

AAA CTT GAC ATG AAC CCA GGC ACT GGT ATT TGG CAG AGC ATT GAT GTG 682
Lys Leu Asp Met Asn Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp Val
195 200 205

AAG ACA GTG TTG CAA AAT TGG CTC AAA CAA CCT GAA TCC AAC TTA GGC 730
Lys Thr Val Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu Gly
210 215 220

ATT GAA ATA AAA GCT TTA GAT GAG AAT GGT CAT GAT CTT GCT GTA ACC 778
Ile Glu Ile Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val Thr
225 230 235 240

TTC CCA GGA CCA GGA GAA GAT GGG CTG AAT CCG TTT TTA GAG GTC AAG 826
Phe Pro Gly Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val Lys
245 250 255

GTA ACA GAC ACA CCA AAA AGA TCC AGA AGG GAT TTT GGT CTT GAC TGT 874
Val Thr Asp Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys
260 265 270

GAT GAG CAC TCA ACA GAA TCA CGA TGC TGT CGT TAC CCT CTA ACT GTG 922
Asp Glu His Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val
275 280 285

GAT TTT GAA GCT TTT GGA TGG GAT TGG ATT ATC GCT CCT AAA AGA TAT 970
Asp Phe Glu Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr
290 295 300

AAG GCC AAT TAC TGC TCT GGA GAG TGT GAA TTT GTA TTT TTA CAA AAA 1018
Lys Ala Asn Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys
305 310 315 320

TAT CCT CAT ACT CAT CTG GTA CAC CAA GCA AAC CCC AGA GGT TCA GCA 1066
Tyr Pro His Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala
325 330 335

GGC CCT TGC TGT ACT CCC ACA AAG ATG TCT CCA ATT AAT ATG CTA TAT 1114
Gly Pro Cys Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr
340 345 350

TTT AAT GGC AAA GAA CAA ATA ATA TAT GGG AAA ATT CCA GCG ATG GTA 1162
Phe Asn Gly Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val
355 360 365

GTA GAC CGC TGT GGG TGC TCA TGAGATTTAT ATTAAGCGTT CATAACTTCC TAAAAC 1219
Val Asp Arg Cys Gly Cys Ser
370 375

ATGGAAGGTT TTCCCCTCAA CAATTTTGAA GCTGTGAAAT TAAGTACCAC AGGCTATAGG 1279

CCTAGAGTAT GCTACAGTCA CTTAAGCATA AGCTACAGTA TGTAAACTAA AAGGGGGAAT 1339

ATATGCAATG GTTGGCATTT AACCATCCAA ACAAATCATA CAAGAAAGTT TTATGATTTC 1399

CAGAGTTTTT GAGCTAGAAG GAGATCAAAT TACATTTATG TTCCTATATA TTACAACATC 1459

GGCGAGGAAA TGAAAGCGAT TCTCCTTGAG TTCTGATGAA TTAAAGGAGT ATGCTTTAAA 1519

GTCTATTTCT TTAAAGTTTT GTTTAATATT TACAGAAAAA TCCACATACA GTATTGGTAA 1579

AATGCAGGAT TGTTATATAC CATCATTCGA ATCATCCTTA AACACTTGAA TTTATATTGT 1639

ATGGTAGTAT ACTTGGTAAG ATAAAATTCC ACAAAAATAG GGATGGTGCA GCATATGCAA 1699

TTTCCATTCC TATTATAATT GACACAGTAC ATTAACAATC CATGCCAACG GTGCTAATAC 1759

GATAGGCTGA ATGTCTGAGG CTACCAGGTT TATCACATAA AAAACATTCA GTAAAATAGT 1819

AAGTTTCTCT TTTCTTCAGG TGCATTTTCC TACACCTCCA AATGAGGAAT GGATTTTCTT 1879

TAATGTAAGA AGAATCATTT TTCTAGAGGT TGGCTTTCAA TTCTGTAGCA TACTTGGAGA 1939

AACTGCATTA TCTTAAAAGG CAGTCAAATG GTGTTTGTTT TTATCAAAAT GTCAAAATAA 1999

CATACTTGGA GAAGTATGTA ATTTTGTCTT TGGAAAATTA CAACACTGCC TTTGCAACAC 2059

TGCAGTTTTT ATGGTAAAAT AATAGAAATG ATCGACTCTA TCAATATTGT ATAAAAAGAC 2119

TGAAACAATG CATTTATATA ATATGTATAC AATATTGTTT TGTAAATAAG TGTCTCCTTT 2179

TTTATTTACT TTGGTATATT TTTACACTAA GGACATTTCA AATTAAGTAC TAAGGCACAA 2239

AGACATGTCA TGCATCACAG AAAAGCAACT ACTTATATTT CAGAGCAAAT TAGCAGATTA 2299

AATAGTGGTC TTAAAACTCC ATATGTTAAT GATTAGATGG TTATATTACA ATCATTTTAT 2359

ATTTTTTTAC ATGATTAACA TTCACTTATG GATTCATGAT GGCTGTATAA AGTGAATTTG 2419

AAATTTCAAT GGTTTACTGT CATTGTGTTT AAATCTCAAC GTTCCATTAT TTTAATACTT 2479

GCAAAAACAT TACTAAGTAT ACCAAAATAA TTGACTCTAT TATCTGAAAT GAAGAATAAA 2539

CTGATGCTAT CTCAACAATA ACTGTTACTT TTATTTTATA ATTTGATAAT GAATATATTT 2599

CTGCATTTAT TTACTTCTGT TTTGTAAATT GGGATTTTGT TAATCAAATT TATTGTACTA 2659

TGACTAAATG AAATTATTTC TTACATCTAA TTTGTAGAAA CAGTATAAGT TATATTAAAG 2719

TGTTTTCACA TTTTTTTGAA AGAC 2743
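
The CDS locations given in the FEATURE blocks of SEQ ID NOS:7, 11 and 13 can be cross-checked against the stated lengths of the corresponding protein sequences (SEQ ID NOS:8, 12 and 14). The following is a small sketch of that arithmetic; the `Cds` record type and its field names are hypothetical, and it assumes the stated CDS ranges do not include a stop codon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cds:
    seq_id: int        # nucleotide SEQ ID NO
    start: int         # 1-based first nucleotide of the CDS
    end: int           # 1-based last nucleotide of the CDS
    protein_len: int   # length stated for the paired protein SEQ ID NO


RECORDS = [
    Cds(seq_id=7, start=3, end=326, protein_len=108),      # pairs with SEQ ID NO:8
    Cds(seq_id=11, start=104, end=1231, protein_len=376),  # pairs with SEQ ID NO:12
    Cds(seq_id=13, start=59, end=1183, protein_len=375),   # pairs with SEQ ID NO:14
]


def codon_count(cds: Cds) -> tuple[int, int]:
    """Return (whole codons, leftover nucleotides) for a CDS range."""
    return divmod(cds.end - cds.start + 1, 3)


results = {r.seq_id: codon_count(r) for r in RECORDS}
for r in RECORDS:
    codons, leftover = results[r.seq_id]
    print(r.seq_id, codons, leftover, codons == r.protein_len)
```

In each case the range is a whole number of codons equal to the stated protein length.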



(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:






EP 0 690 873 B1 



Met Gln Lys Leu Gln Leu Cys Val Tyr Ile Tyr Leu Phe Met Leu Ile
1 5 10 15

Val Ala Gly Pro Val Asp Leu Asn Glu Asn Ser Glu Gln Lys Glu Asn
20 25 30

Val Glu Lys Glu Gly Leu Cys Asn Ala Cys Thr Trp Arg Gln Asn Thr
35 40 45

Lys Ser Ser Arg Ile Glu Ala Ile Lys Ile Gln Ile Leu Ser Lys Leu
50 55 60

Arg Leu Glu Thr Ala Pro Asn Ile Ser Lys Asp Val Ile Arg Gln Leu
65 70 75 80

Leu Pro Lys Ala Pro Pro Leu Arg Glu Leu Ile Asp Gln Tyr Asp Val
85 90 95

Gln Arg Asp Asp Ser Ser Asp Gly Ser Leu Glu Asp Asp Asp Tyr His
100 105 110

Ala Thr Thr Glu Thr Ile Ile Thr Met Pro Thr Glu Ser Asp Phe Leu
115 120 125

Met Gln Val Asp Gly Lys Pro Lys Cys Cys Phe Phe Lys Phe Ser Ser
130 135 140

Lys Ile Gln Tyr Asn Lys Val Val Lys Ala Gln Leu Trp Ile Tyr Leu
145 150 155 160

Arg Pro Val Glu Thr Pro Thr Thr Val Phe Val Gln Ile Leu Arg Leu
165 170 175

Ile Lys Pro Met Lys Asp Gly Thr Arg Tyr Thr Gly Ile Arg Ser Leu
180 185 190

Lys Leu Asp Met Asn Pro Gly Thr Gly Ile Trp Gln Ser Ile Asp Val
195 200 205

Lys Thr Val Leu Gln Asn Trp Leu Lys Gln Pro Glu Ser Asn Leu Gly
210 215 220

Ile Glu Ile Lys Ala Leu Asp Glu Asn Gly His Asp Leu Ala Val Thr
225 230 235 240

Phe Pro Gly Pro Gly Glu Asp Gly Leu Asn Pro Phe Leu Glu Val Lys
245 250 255

Val Thr Asp Thr Pro Lys Arg Ser Arg Arg Asp Phe Gly Leu Asp Cys
260 265 270

Asp Glu His Ser Thr Glu Ser Arg Cys Cys Arg Tyr Pro Leu Thr Val
275 280 285

Asp Phe Glu Ala Phe Gly Trp Asp Trp Ile Ile Ala Pro Lys Arg Tyr
290 295 300

Lys Ala Asn Tyr Cys Ser Gly Glu Cys Glu Phe Val Phe Leu Gln Lys
305 310 315 320

Tyr Pro His Thr His Leu Val His Gln Ala Asn Pro Arg Gly Ser Ala
325 330 335

Gly Pro Cys Cys Thr Pro Thr Lys Met Ser Pro Ile Asn Met Leu Tyr
340 345 350

Phe Asn Gly Lys Glu Gln Ile Ile Tyr Gly Lys Ile Pro Ala Met Val
355 360 365

Val Asp Arg Cys Gly Cys Ser
370 375
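
The murine (SEQ ID NO:12) and human (SEQ ID NO:14) sequences can be compared over the region downstream of the Arg-Ser-Arg-Arg site. The sketch below is a minimal illustration, assuming one-letter transcriptions of residues 257-376 of SEQ ID NO:12 and residues 257-375 of SEQ ID NO:14; the helper `after_site` and the variable names are illustrative and not part of the listing.

```python
# One-letter transcriptions of the C-terminal rows of each listing (assumed).
MOUSE_C_TERM = (
    "KVTDTPKRSRRDFGLD" "CDEHSTESRCCRYPLT" "VDFEAFGWDWIIAPKR"
    "YKANYCSGECEFVFLQ" "KYPHTHLVHQANPRGS" "AGPCCTPTKMSPINML"
    "YFNGKEQIIYGKIPAM" "VVDRCGCS"
)
HUMAN_C_TERM = (
    "VTDTPKRSRRDFGLDC" "DEHSTESRCCRYPLTV" "DFEAFGWDWIIAPKRY"
    "KANYCSGECEFVFLQK" "YPHTHLVHQANPRGSA" "GPCCTPTKMSPINMLY"
    "FNGKEQIIYGKIPAMV" "VDRCGCS"
)


def after_site(seq: str, site: str = "RSRR") -> str:
    """Return the part of `seq` that follows the first occurrence of `site`."""
    return seq[seq.index(site) + len(site):]


mouse_tail = after_site(MOUSE_C_TERM)
human_tail = after_site(HUMAN_C_TERM)
identical = sum(a == b for a, b in zip(mouse_tail, human_tail))

print(len(mouse_tail), len(human_tail), identical)
```

As transcribed, both tails are 109 residues long and match at every position.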



(2) INFORMATION FOR SEQ ID NO:15: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: #83 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..34 

(C) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 
CGCGGATCCG TGGATCTAAA TGAGAACAGT GAGC 

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Genomic DNA 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: #84 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..37 

(C) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 
CGCGAATTCT CAGGTAATGA TTGTTTCCGT TGTAGCG 
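
The two oligonucleotides of SEQ ID NOS:15 and 16 can be related to the human GDF-8 sequence of SEQ ID NO:13 by simple string operations. The sketch below is illustrative only: it assumes that the 5' tails carry the well-known BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences, and it uses two short excerpts transcribed from SEQ ID NO:13. The names `TEMPLATE_5P`, `TEMPLATE_3P` and `revcomp` are hypothetical.

```python
BAMHI = "GGATCC"  # BamHI recognition sequence
ECORI = "GAATTC"  # EcoRI recognition sequence

PRIMER_83 = "CGCGGATCCGTGGATCTAAATGAGAACAGTGAGC"     # SEQ ID NO:15
PRIMER_84 = "CGCGAATTCTCAGGTAATGATTGTTTCCGTTGTAGCG"  # SEQ ID NO:16

# Excerpts transcribed from the coding strand of SEQ ID NO:13 (assumed).
TEMPLATE_5P = "GGTCCAGTGGATCTAAATGAGAACAGTGAGCAAAAAGAAAAT"
TEMPLATE_3P = "GATTATCACGCTACAACGGAAACAATCATTACCATGCCTACAGAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


# Portion of each primer downstream of its 5' tail.
anneal_83 = PRIMER_83[9:]   # after "CGC" + BamHI site
anneal_84 = PRIMER_84[12:]  # after "CGC" + EcoRI site + "TCA"

print(BAMHI in PRIMER_83, anneal_83 in TEMPLATE_5P)
print(ECORI in PRIMER_84, revcomp(anneal_84) in TEMPLATE_3P)
print(revcomp(PRIMER_84[9:12]))  # the TCA triplet read on the coding strand
```

On this reading, primer #83 matches the coding strand directly, primer #84 matches it as a reverse complement, and the TCA triplet in primer #84 corresponds to a TGA stop codon on the coding strand.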

(2) INFORMATION FOR SEQ ID NO:17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Genomic DNA
(vii) IMMEDIATE SOURCE: 

(B) CLONE: #100 
(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..20 

(C) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 



ACACTAAATC TTCAAGAATA 

(2) INFORMATION FOR SEQ ID NO:18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 123 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: GDF-1 
(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..123 

(C) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 



Arg Pro Arg Arg Asp Ala Glu Pro Val Leu Gly Gly Gly Pro Gly Gly
1 5 10 15

Ala Cys Arg Ala Arg Arg Leu Tyr Val Ser Phe Arg Glu Val Gly Trp
20 25 30

His Arg Trp Val Ile Ala Pro Arg Gly Phe Leu Ala Asn Tyr Cys Gln
35 40 45

Gly Gln Cys Ala Leu Pro Val Ala Leu Ser Gly Ser Gly Gly Pro Pro
50 55 60

Ala Leu Asn His Ala Val Leu Arg Ala Leu Met His Ala Ala Ala Pro
65 70 75 80

Gly Ala Ala Asp Leu Pro Cys Cys Val Pro Ala Arg Leu Ser Pro Ile
85 90 95

Ser Val Leu Phe Phe Asp Asn Ser Asp Asn Val Val Leu Arg Gln Tyr
100 105 110

Glu Asp Met Val Val Asp Glu Cys Gly Cys Arg
115 120

(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: BMP-2 
(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..118
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 



Arg Glu Lys Arg Gln Ala Lys His Lys Gln Arg Lys Arg Leu Lys Ser
1 5 10 15

Ser Cys Lys Arg His Pro Leu Tyr Val Asp Phe Ser Asp Val Gly Trp
20 25 30

Asn Asp Trp Ile Val Ala Pro Pro Gly Tyr His Ala Phe Tyr Cys His
35 40 45

Gly Glu Cys Pro Phe Pro Leu Ala Asp His Leu Asn Ser Thr Asn His
50 55 60

Ala Ile Val Gln Thr Leu Val Asn Ser Val Asn Ser Lys Ile Pro Lys
65 70 75 80

Ala Cys Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Met Leu Tyr Leu
85 90 95

Asp Glu Asn Glu Lys Val Val Leu Lys Asn Tyr Gln Asp Met Val Val
100 105 110

Glu Gly Cys Gly Cys Arg
115



(2) INFORMATION FOR SEQ ID NO:20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 118 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: BMP-4 
(ix) FEATURE:
(A) NAME/KEY: Protein

(B) LOCATION: 1..118 
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 







Ser Pro Lys His His Ser Gln Arg Ala Arg Lys Lys Asn Lys
1 5 10 15

Asn Cys Arg Arg His Ser Leu Tyr Val Asp Phe Ser Asp Val Gly Trp
20 25 30

Asn Asp Trp Ile Val Ala Pro Pro Gly Tyr Gln Ala Phe Tyr Cys His
35 40 45

Gly Asp Cys Pro Phe Pro Leu Ala Asp His Leu Asn Ser Thr Asn His
50 55 60

Ala Ile Val Gln Thr Leu Val Asn Ser Val Asn Ser Ser Ile Pro Lys
65 70 75 80

Ala Cys Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Met Leu Tyr Leu
85 90 95

Asp Glu Tyr Asp Lys Val Val Leu Lys Asn Tyr Gln Glu Met Val Val
100 105 110

Glu Gly Cys Gly Cys Arg
115





(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: Vgr-1 
(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..119 
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Ser Arg Gly Ser Gly Ser Ser Asp Tyr Asn Gly Ser Glu Leu Lys Thr
1 5 10 15

Ala Cys Lys Lys His Glu Leu Tyr Val Ser Phe Gln Asp Leu Gly Trp
20 25 30

Gln Asp Trp Ile Ile Ala Pro Lys Gly Tyr Ala Ala Asn Tyr Cys Asp
35 40 45

Gly Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His
50 55 60

Ala Ile Val Gln Thr Leu Val His Leu Met Asn Pro Glu Tyr Val Pro
65 70 75 80

Lys Pro Cys Cys Ala Pro Thr Lys Leu Asn Ala Ile Ser Val Leu Tyr
85 90 95

Phe Asp Asp Asn Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val
100 105 110

Val Arg Ala Cys Gly Cys His
115

(2) INFORMATION FOR SEQ ID NO:22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 119 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:

(B) CLONE: OP-1

(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..119
(D) OTHER:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:





Leu Arg Met Ala Asn Val Ala Glu Asn Ser Ser Ser Asp Gln Arg Gln
1 5 10 15

Ala Cys Lys Lys His Glu Leu Tyr Val Ser Phe Arg Asp Leu Gly Trp
20 25 30

Gln Asp Trp Ile Ile Ala Pro Glu Gly Tyr Ala Ala Tyr Tyr Cys Glu
35 40 45

Gly Glu Cys Ala Phe Pro Leu Asn Ser Tyr Met Asn Ala Thr Asn His
50 55 60

Ala Ile Val Gln Thr Leu Val His Phe Ile Asn Pro Glu Thr Val Pro
65 70 75 80

Lys Pro Cys Cys Ala Pro Thr Gln Leu Asn Ala Ile Ser Val Leu Tyr
85 90 95

Phe Asp Asp Ser Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val
100 105 110

Val Arg Ala Cys Gly Cys His
115

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: BMP-5 
(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..119 
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



Ser Arg Met Ser Ser Val Gly Asp Tyr Asn Thr Ser Glu Gln Lys Gln
1 5 10 15

Ala Cys Lys Lys His Glu Leu Tyr Val Ser Phe Arg Asp Leu Gly Trp
20 25 30

Gln Asp Trp Ile Ile Ala Pro Glu Gly Tyr Ala Ala Phe Tyr Cys Asp
35 40 45

Gly Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His
50 55 60

Ala Ile Val Gln Thr Leu Val His Leu Met Phe Pro Asp His Val Pro
65 70 75 80

Lys Pro Cys Cys Ala Pro Thr Lys Leu Asn Ala Ile Ser Val Leu Tyr
85 90 95

Phe Asp Asp Ser Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val
100 105 110

Val Arg Ser Cys Gly Cys His
115

(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: BMP-3 
(ix) FEATURE: 

(A) NAME/KEY: Protein
(B) LOCATION: 1..120
(D) OTHER: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 



Glu Gln Thr Leu Lys Lys Ala Arg Arg Lys Gln Trp Ile Glu Pro Arg
1 5 10 15

Asn Cys Ala Arg Arg Tyr Leu Lys Val Asp Phe Ala Asp Ile Gly Trp
20 25 30

Ser Glu Trp Ile Ile Ser Pro Lys Ser Phe Asp Ala Tyr Tyr Cys Ser
35 40 45

Gly Ala Cys Gln Phe Pro Met Pro Lys Ser Leu Lys Pro Ser Asn His
50 55 60

Ala Thr Ile Gln Ser Ile Val Arg Ala Val Gly Val Val Pro Gly Ile
65 70 75 80

Pro Glu Pro Cys Cys Val Pro Glu Lys Met Ser Ser Leu Ser Ile Leu
85 90 95

Phe Phe Asp Glu Asn Lys Asn Val Val Leu Lys Val Tyr Pro Asn Met
100 105 110

Thr Val Glu Ser Cys Ala Cys Arg
115 120



(2) INFORMATION FOR SEQ ID NO:25: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(vii) IMMEDIATE SOURCE: 

(B) CLONE: MIS 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..116 
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Gly Pro Gly Arg Ala Gln Arg Ser Ala Gly Ala Thr Ala Ala Asp Gly
1 5 10 15

Pro Cys Ala Leu Arg Glu Leu Ser Val Asp Leu Arg Ala Glu Arg Ser
20 25 30

Val Leu Ile Pro Glu Thr Tyr Gln Ala Asn Asn Cys Gln Gly Val Cys
35 40 45

Gly Trp Pro Gln Ser Asp Arg Asn Pro Arg Tyr Gly Asn His Val Val
50 55 60

Leu Leu Leu Lys Met Gln Ala Arg Gly Ala Ala Leu Ala Arg Pro Pro
65 70 75 80

Cys Cys Val Pro Thr Ala Tyr Ala Gly Lys Leu Leu Ile Ser Leu Ser
85 90 95

Glu Glu Arg Ile Ser Ala His His Val Pro Asn Met Val Ala Thr Glu
100 105 110

Cys Gly Cys Arg
115



(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: Inhibin-alpha 
(ix) FEATURE: 



(A) NAME/KEY: Protein 

(B) LOCATION: 1..122 
(D) OTHER: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 



Ala Leu Arg Leu Leu Gln Arg Pro Pro Glu Glu Pro Ala Ala His Ala
1 5 10 15

Asn Cys His Arg Val Ala Leu Asn Ile Ser Phe Gln Glu Leu Gly Trp
20 25 30

Glu Arg Trp Ile Val Tyr Pro Pro Ser Phe Ile Phe His Tyr Cys His
35 40 45

Gly Gly Cys Gly Leu His Ile Pro Pro Asn Leu Ser Leu Pro Val Pro
50 55 60

Gly Ala Pro Pro Thr Pro Ala Gln Pro Tyr Ser Leu Leu Pro Gly Ala
65 70 75 80

Gln Pro Cys Cys Ala Ala Leu Pro Gly Thr Met Arg Pro Leu His Val
85 90 95

Arg Thr Thr Ser Asp Gly Gly Tyr Ser Phe Lys Tyr Glu Thr Val Pro
100 105 110

Asn Leu Leu Thr Gln His Cys Ala Cys Ile
115 120

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 122 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vii) IMMEDIATE SOURCE:

(B) CLONE: Inhibin-beta-alpha
(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..122
(D) OTHER:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:



His Arg Arg Arg Arg Arg Gly Leu Glu Cys Asp Gly Lys Val Asn Ile
1 5 10 15

Cys Cys Lys Lys Gln Phe Phe Val Ser Phe Lys Asp Ile Gly Trp Asn
20 25 30

Asp Trp Ile Ile Ala Pro Ser Gly Tyr His Ala Asn Tyr Cys Glu Gly
35 40 45

Glu Cys Pro Ser His Ile Ala Gly Thr Ser Gly Ser Ser Leu Ser Phe
50 55 60

His Ser Thr Val Ile Asn His Tyr Arg Met Arg Gly His Ser Pro Phe
65 70 75 80

Ala Asn Leu Lys Ser Cys Cys Val Pro Thr Lys Leu Arg Pro Met Ser
85 90 95

Met Leu Tyr Tyr Asp Asp Gly Gln Asn Ile Ile Lys Lys Asp Ile Gln
100 105 110

Asn Met Ile Val Glu Glu Cys Gly Cys Ser
115 120









(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 121 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:

(B) CLONE: Inhibin-beta-beta
(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..121
(D) OTHER:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:



His Arg Ile Arg Lys Arg Gly Leu Glu Cys Asp Gly Arg Thr Asn Leu
1 5 10 15

Cys Cys Arg Gln Gln Phe Phe Ile Asp Phe Arg Leu Ile Gly Trp Asn
20 25 30

Asp Trp Ile Ile Ala Pro Thr Gly Tyr Tyr Gly Asn Tyr Cys Glu Gly
35 40 45

Ser Cys Pro Ala Tyr Leu Ala Gly Val Pro Gly Ser Ala Ser Ser Phe
50 55 60

His Thr Ala Val Val Asn Gln Tyr Arg Met Arg Gly Leu Asn Pro Gly
65 70 75 80

Thr Val Asn Ser Cys Cys Ile Pro Thr Lys Leu Ser Thr Met Ser Met
85 90 95

Leu Tyr Phe Asp Asp Glu Tyr Asn Ile Val Lys Arg Asp Val Pro Asn
100 105 110

Met Ile Val Glu Glu Cys Gly Cys Ala
115 120











(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: TGF-beta-1 
(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..115
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

His Arg Arg Ala Leu Asp Thr Asn Tyr Cys Phe Ser Ser Thr Glu Lys
1 5 10 15

Asn Cys Cys Val Arg Gln Leu Tyr Ile Asp Phe Arg Lys Asp Leu Gly
20 25 30

Trp Lys Trp Ile His Glu Pro Lys Gly Tyr His Ala Asn Phe Cys Leu
35 40 45

Gly Pro Cys Pro Tyr Ile Trp Ser Leu Asp Thr Gln Tyr Ser Lys Val
50 55 60

Leu Ala Leu Tyr Asn Gln His Asn Pro Gly Ala Ser Ala Ala Pro Cys
65 70 75 80

Cys Val Pro Gln Ala Leu Glu Pro Leu Pro Ile Val Tyr Tyr Val Gly
85 90 95

Arg Lys Pro Lys Val Glu Gln Leu Ser Asn Met Ile Val Arg Ser Cys
100 105 110

Lys Cys Ser
115



(2) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:

(B) CLONE: TGF-beta-2
(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..115
(D) OTHER:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:





Lys Lys Arg Ala Leu Asp Ala Ala Tyr Cys Phe Arg Asn Val Gln Asp
1               5                   10                  15

Asn Cys Cys Leu Arg Pro Leu Tyr Ile Asp Phe Lys Arg Asp Leu Gly
            20                  25                  30

Trp Lys Trp Ile His Glu Pro Lys Gly Tyr Asn Ala Asn Phe Cys Ala
        35                  40                  45

Gly Ala Cys Pro Tyr Leu Trp Ser Ser Asp Thr Gln His Ser Arg Val
    50                  55                  60

Leu Ser Leu Tyr Asn Thr Ile Asn Pro Glu Ala Ser Ala Ser Pro Cys
65                  70                  75                  80

Cys Val Ser Gln Asp Leu Glu Pro Leu Thr Ile Leu Tyr Tyr Ile Gly
                85                  90                  95

Lys Thr Pro Lys Ile Glu Gln Leu Ser Asn Met Ile Val Lys Ser Cys
            100                 105                 110

Lys Cys Ser
        115
(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(vii) IMMEDIATE SOURCE: 

(B) CLONE: TGF-beta-3 
(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..115 
(D) OTHER: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 



Lys Lys Arg Ala Leu Asp Thr Asn Tyr Cys Phe Arg Asn Leu Glu Glu
1               5                   10                  15

Asn Cys Cys Val Arg Pro Leu Tyr Ile Asp Phe Arg Gln Asp Leu Gly
            20                  25                  30

Trp Lys Trp Val His Glu Pro Lys Gly Tyr Tyr Ala Asn Phe Cys Ser
        35                  40                  45

Gly Pro Cys Pro Tyr Leu Arg Ser Ala Asp Thr Thr His Ser Thr Val
    50                  55                  60

Leu Gly Leu Tyr Asn Thr Leu Asn Pro Glu Ala Ser Ala Ser Pro Cys
65                  70                  75                  80

Cys Val Pro Gln Asp Leu Glu Pro Leu Thr Ile Leu Tyr Tyr Val Gly
                85                  90                  95

Arg Thr Pro Lys Val Glu Gln Leu Ser Asn Met Val Val Lys Ser Cys
            100                 105                 110

Lys Cys Ser
        115



(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(ix) FEATURE: 



(A) NAME/KEY: Peptide 

(B) LOCATION: 1..118 

(C) OTHER: where X at position 2 and 3 is any amino acid 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 
Arg Xaa Xaa Arg 
1 
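The four-residue pattern of SEQ ID NO:32 (Arg, any residue, any residue, Arg) lends itself to a simple motif scan over a one-letter protein sequence. The sketch below is illustrative only: `find_rxxr_sites` is a hypothetical helper, and the test fragment is transcribed from the GDF-8 figures with OCR ambiguities resolved by assumption, so it should not be read as an authoritative sequence.

```python
import re


def find_rxxr_sites(protein):
    """Return (1-based start position, matched motif) for every Arg-Xaa-Xaa-Arg site.

    A lookahead is used so that overlapping sites (for example in 'RRRRR')
    are all reported rather than only the first non-overlapping one.
    """
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(r"(?=(R..R))", protein)]


# Hypothetical illustrative fragment; transcription from the figures is an assumption.
fragment = "KVTDTPKRSRRDFGLDCDEHS"
sites = find_rxxr_sites(fragment)
print(sites)
```

The scan only locates candidate motifs; whether a given site is actually processed is a biological question the code does not address.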



Claims 

1. A polynucleotide sequence encoding a growth differentiation factor-8 polypeptide (GDF-8) or a part thereof selected 
from the group consisting of: 



(a) SEQ ID NO. 11; 

(b) nucleotides 151 to 1282 of SEQ ID NO. 11; 

(c) nucleotides 952 to 1282 of SEQ ID NO. 11; 

(d) SEQ ID NO. 13; 

(e) nucleotides 106 to 1233 of SEQ ID NO. 13; 

(f) nucleotides 904 to 1233 of SEQ ID NO. 13; 

(g) sequences which are degenerate as a result of the genetic code with respect to those of (a) to (f); 

(h) sequences which are complementary to those of (a) to (g); and 

(i) fragments of (a) to (h) that are at least 15 bases in length and that will selectively hybridise under stringent 
conditions to genomic DNA which encodes the GDF-8 protein of SEQ ID NO. 12 or 14. 

2. The polynucleotide sequence of claim 1, wherein the polynucleotide is isolated from a mammalian cell. 

3. The polynucleotide of claim 2, wherein the mammalian cell is a mouse, rat or human cell. 

4. The polynucleotide sequence or fragments thereof of any one of claims 1 to 3 which are DNA sequences. 

5. An expression vector including a DNA sequence of claim 4. 

6. The vector of claim 5, which is a plasmid. 

7. The vector of claim 5, which is a virus. 

8. A host cell stably transformed with the vector of any one of claims 5 to 7. 

9. The host cell of claim 8, wherein the cell is prokaryotic or eukaryotic. 

10. GDF-8 or a functional fragment thereof encoded by a polynucleotide or DNA sequence of any one of claims 1 to 4. 

11. A method for the production of the GDF-8 or functional fragment thereof of claim 10, comprising culturing the host 
cell of claim 8 or 9 and isolating said GDF-8 or functional fragment thereof from the culture. 

12. Antibodies or fragments thereof reactive with the GDF-8 or functional fragments thereof of claim 10. 

13. The antibodies of claim 12, wherein the antibodies are polyclonal or monoclonal. 

14. A diagnostic composition comprising the antibody or fragment thereof of claim 12 or 13. 

15. A method of detecting a cell proliferation disorder in vitro, comprising contacting the antibody or fragment thereof 
of claim 12 or 13 with a specimen of a subject suspected of having a GDF-8 associated disorder and detecting 
binding of the antibody or the fragment thereof. 

16. The method of claim 15, wherein the specimen comprises a muscle cell. 

17. The method of claim 15 or 16, wherein the antibody or fragment thereof is detectably labelled. 

18. The method of claim 17, wherein the label is a radioisotope, a fluorescent compound, a bioluminescent compound, 
a chemiluminescent compound or an enzyme. 
19. An antisense sequence that is complementary to, and capable of hybridising under stringent conditions with, at 
least 15 nucleotides of the polynucleotide sequence of any one of claims 1 to 4. 

20. A ribozyme that is capable of recognising and cleaving the polynucleotide sequence of any one of claims 1 to 4. 

21. A therapeutic composition comprising an antibody or fragment thereof of claim 12 or 13, an antisense sequence 
of claim 19 or a ribozyme of claim 20. 

22. Use of an antibody or fragment thereof of claim 12 or 13, an antisense sequence of claim 19 or a ribozyme of claim 
20 as a reagent which suppresses the GDF-8 activity for the preparation of a composition for the treatment of a 
cell proliferation disorder associated with expression of GDF-8. 

23. The use of claim 22 wherein said cell is a muscle cell. 

24. The use of claim 22 or 23, wherein the reagent which suppresses GDF-8 activity is introduced into a cell using a 
vector. 

25. The use of claim 24, wherein the vector is a colloidal dispersion system. 

26. The use of claim 25, wherein the colloidal dispersion system is a liposome. 

27. The use of claim 26, wherein the liposome is essentially target specific. 

28. The use of claim 26 or 27, wherein the liposome is anatomically targeted. 

29. The use of any one of claims 26 to 28, wherein the liposome is mechanistically targeted. 

30. The use of claim 29, wherein the mechanistic targeting is passive or active. 

31. The use of claim 30, wherein the liposome is actively targeted by coupling with a moiety selected from the group 
consisting of a sugar, a glycolipid and a protein. 

32. The use of claim 24, wherein the vector is a virus. 

33. The use of claim 32, wherein the virus is an RNA virus. 

34. The use of claim 33, wherein the RNA virus is a retrovirus. 

35. The use of claim 34, wherein the retrovirus is essentially target specific. 

36. The use of claim 35, wherein a moiety for target specificity is encoded by a polynucleotide inserted into the retroviral 
genome. 

37. The use of claim 36, wherein a moiety for target specificity is selected from the group consisting of a sugar, a 
glycolipid and a protein. 

38. The use of claim 31 or 37, wherein the protein is an antibody. 
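
The nucleotide ranges recited in claim 1 are 1-based and inclusive, and items (h) and (i) refer to complementary sequences and to a minimum fragment length of 15 bases. The sketch below shows one way such coordinates and operations might be handled in code. All function names are hypothetical, the `toy` string is a placeholder rather than SEQ ID NO. 11 or 13, and hybridisation under stringent conditions is an experimental property that the code does not attempt to model.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def subsequence(seq, start, end):
    """Return nucleotides start..end using 1-based, inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range lies outside the sequence")
    return seq[start - 1:end]


def reverse_complement(seq):
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def span_length(start, end):
    """Number of nucleotides in an inclusive 1-based range."""
    return end - start + 1


def meets_minimum_length(fragment, minimum=15):
    """Check only the length condition; selectivity of hybridisation is not modelled."""
    return len(fragment) >= minimum


toy = "ACGTACGTACGTACGTACGT"  # placeholder sequence, not taken from the specification
print(subsequence(toy, 2, 5))        # bases 2 to 5 of the placeholder
print(reverse_complement("ATGC"))
print(span_length(151, 1282), span_length(952, 1282))
```

Converting between the inclusive 1-based ranges used in the claims and Python's half-open 0-based slices is the step most prone to off-by-one errors, which is why it is isolated in `subsequence`.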
Patentanspruche 

1 . Poiynukleotidsequenz, codierend fur ein Wachstumsdifferenzierungsfaktor-8-Polypeptid (GDF-8) oder einen Teil 
davon, ausgewahlt aus der Gruppe bestehend aus: 

(a) SEQ-ID-Nr. 11; 

(b) den Nukleotiden 151 bis 1282 von SEQ-ID-Nr. 11; 

(c) den Nukleotiden 952 bis 1282 von SEQ-ID-Nr. li- 
fe!) SEQ-ID-Nr. 13; 
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(e) den Nukleotiden 106 bis 1233 von SEQ-ID-Nr. 13; 

(f) den Nukleotiden 904 bis 1233 von SEQ-ID-Nr. 13; 

(g) Sequenzen, welche infolge des genetischen Codes degeneriert bezuglich derjenigen von (a) bis (f) sind; 

(h) Sequenzen, welche zu denjenigen von (a) bis (g) komplementar sind; und 

(i) Fragmenten von (a) bis (h), welche mindestens 15 Basen Lange aufweisen und welche unter stringenten 
Bedingungen selektiv mit genomischer DNA hybridisieren werden, die fur das GDF-8-Protein von SEQ-ID-Nr. 
12 oder 14 codiert; 

2. Polynukleotidsequenz nach Anspruch 1 , wobei das Polynukleotid aus einer Saugerzelle isoliert ist. 

3. Polynukleotid nach Anspruch 2, wobei die Saugerzelle eine Maus-, Ratten- Oder Humanzelle ist. 

4. Polynukleotidsequenz oder Fragmente davon nach irgendeinem der Anspruche 1 bis 3, welche DNA-Sequenzen 
sind. 

5. Express ionsvektor, umfassend eine DNA-Sequenz nach Anspruch 4. 

6. Vektor nach Anspruch 5, welcher ein Plasmid ist. 

7. Vektor nach Anspruch 5, welcher ein Virus ist. 

8. Wirtszelle, welche mit dem Vektor nach irgendeinem der Anspruche 5 bis 7 stabil transformiert ist. 

9. Wirtszelle nach Anspruch 8, wobei die Zelle prokaryotisch oder eukaryotisch ist. 

10. GDF-8 oder ein funktionelles Fragment davon, codiert von einem Polynukleotid oder einer DNA-Sequenz nach 
irgendeinem der Anspruche 1 bis 4. 

11. Verfahren zur Herstellung des GDF-8 oder funktionellen Fragments davon nach Anspruch 10, umfassend die 
Kultivierung der Wirtszelle nach Anspruch 8 oder 9 und die isolierung des GDF-8 oder funktionellen Fragments 
davon aus der Kultur. 

12. Antikorper oder Fragmente davon, die mit dem GDF-8 Oder den funktionellen Fragmenten davon nach Anspruch 
10 reagieren konnen. 

13. Antikorper nach Anspruch 12, wobei die Antikorper polyklonal oder monoklonal sind. 

14. Diagnostische Zusammensetzung, umfassend den Antikorper oder das Fragment davon nach Anspruch 12 oder 
13. 

1 5. Verfahren zum Nachweis einer Zellproliferationsstorung in vitro, umfassend das Kontaktieren des Antikorpers oder 
Fragments davon nach Anspruch 12 oder 13 mit einer Probe von einem Individuum, von dem angenommen wird, 
daB es eine mit GDF-8 assoziierte Storung aufweist, und Nachweisen der Bindung des Antikorpers oder des 
Fragments davon. 

16. Verfahren nach Anspruch 15, wobei die Probe eine Muskelzelle umfaBt. 

17. Verfahren nach Anspruch 15 oder 16, wobei der Antikorper oder das Fragment davon nachweisbar markiert ist. 

18. Verfahren nach Anspruch 17, wobei die Markierung ein Radioisotop, eine fluoreszierende Verbindung, eine bio- 
lumineszierende Verbindung, eine chemolumineszierende Verbindung oder ein Enzym ist. 

19. Antisense-Sequenz, welche komplementar zu mindestens 15 Nukleotiden der Polynukleotidsequenz nach irgend- 
einem der Anspruche 1 bis 4 ist und imstande ist, damit unter stringenten Bedingungen zu hybridisieren. 

20. Ribozym, welches imstande ist, die Polynukleotidsequenz nach irgendeinem der Anspruche 1 bis 4 zu erkennen 
und zu spalten. 
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21. Therapeutische Zusammensetzung, umfassend einen Antikorper oder ein Fragment davon nach Anspruch 12 
Oder 13, eine Antisense-Sequenz nach Anspruch 19 Oder ein Ribozym nach Anspruch 20. 

22. Verwendung eines Antikorpers oder Fragments davon nach Anspruch 12 oder 13, einer Antisense-Sequenz nach 
Anspruch 19 oder eines Ribozyms nach Anspruch 20 als Reagenz, welches die GDF-8-Aktivitat unterdruckt, zur 
Herstellung einer Zusammensetzung zur Behandlung einer Zellproliferationsstorung, die mit der Expression von 
GDF-8 assoziiert ist. 

23. Verwendung nach Anspruch 22, wobei die Zelle eine Muskelzelle ist. 

24. Verwendung nach Anspruch 22 oder 23, wobei das Reagenz, welches die GDF-8-Aktivitat unterdruckt, mit Hilfe 
eines Vektors in eine Zelle eingefuhrt wird. 

25. Verwendung nach Anspruch 24, wobei der Vektor ein kolloidales Dispersionssystem ist. 

26. Verwendung nach Anspruch 25, wobei das kolloidale Dispersionssystem ein Liposom ist. 

27. Verwendung nach Anspruch 26, wobei das Liposom im wesentlichen zielspezifisch ist. 

20 28. Verwendung nach Anspruch 26 oder 27, wobei das Liposom anatomisch fur ein Ziel bestimmt wird. 

29. Verwendung nach irgendeinem der Anspruche 26 bis 28, wobei das Liposom mechanistisch fur ein Ziel bestimmt 
wird. 

25 30. Verwendung nach Anspruch 29, wobei die mechanistische Zielbestimmung passiv oder aktiv erfolgt. 

31. Verwendung nach Anspruch 30, wobei das Liposom durch Kopplung mit einer Gruppierung, die aus der Gruppe 
bestehend aus einem Zucker, einem Glycolipid und einem Protein ausgewahlt ist, aktiv fur ein Ziel bestimmt wird. 

30 32. Verwendung nach Anspruch 24, wobei der Vektor ein Virus ist. 

33. Verwendung nach Anspruch 32, wobei das Virus ein RNA-Virus ist. 

34. Verwendung nach Anspruch 33, wobei das RNA-Virus ein Retrovirus ist. 

35. Verwendung nach Anspruch 34, wobei das Retrovirus im wesentlichen zielspezifisch ist. 






36. Verwendung nach Anspruch 35, wobei eine Gruppierung fur die Zielspezifitat von einem Polynukleotid codiert 
wird, welches in das retrovirate Genom inseriert ist. 

37. Verwendung nach Anspruch 36, wobei eine Gruppierung fur die Zielspezifitat aus der Gruppe bestehend aus einem 
Zucker, einem Glycolipid und einem Protein ausgewahlt ist. 

38. Verwendung nach Anspruch 31 oder 37, wobei das Protein ein Antikorper ist. 
Revendications 

1. Sequence polynucleotidique codant pour un polypeptide, le facteur de croissance et de differenciation 8 (GDF-8), 
50 ou une partie de celui-ci s^lectionnee parmi le groupe comprenant : 

(a) la SEQ ID n° 11; 

(b) les nucleotides 151 k 1282 de la SEQ ID n° 11; 

(c) les nucleotides 952 a 1282 de la SEQ ID n° 11; 
55 (d) la SEQ ID n° 13; 

(e) les nucleotides 106 k 1233 de la SEQ ID n° 13; 

(f) les nucleotides 904 & 1233 de la SEQ ID n° 13; 

(g) des sequences qui sont d§g6nerees comme permis par le code genetique par rapport & celles de (a) & (f); 
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(h) des sequences qui sont complementaires de celles de (a) a (g), et 

(i) des fragments de (a) a (h) qui sont au moins d'une longueur de 1 5 bases et qui realiseront selectivement 
une hybridation dans des conditions stringentes avec I'ADN genomique qui code pour ia proteine GDF-8 des 
SEQ lDn° 12 ou 14. 


2. Sequence polynucleotidique suivant la revendication 1 , dans laquelle le polynucleotide est isote d'une cellule de 
mammifere. 

3. Polynucleotide suivant la revendication 2, dans lequel la cellule de mammifere est une cellule de souris, de rat ou 
10 humaine. 

4. Sequence polynucleotidique ou des fragments de celle-ci suivant Tune quelconque des revendications 1 a 3, qui 
sont des sequences d'ADN. 

15 5. Vecteur d'expression comprenant une sequence d'ADN suivant la revendication 4. 

6. Vecteur suivant la revendication 5, qui est un plasmide. 

7. Vecteur suivant la revendication 5, qui est un virus. 


8. Cellule h6te transformee de maniere stable par le vecteur suivant I'une quelconque des revendications 5 a 7. 

9. Cellule h6te suivant la revendication 8, dans laquelle la cellule est procaryote ou eucaryote. 

25 10. GDF-8 ou un fragment fonctionnel de celui-ci code par une sequence polynucleotidique ou d'ADN suivant I'une 
quelconque des revendications 1 a 4. 

11. Procede pour la production du GDF-8 ou d'un fragment fonctionnel de celui-ci suivant la revendication 10, com- 
prenant la mise en culture de la cellule note suivant la revendication 8 ou 9 et I'isolement dudit GDF-8 ou d'un 

30 fragment fonctionnel de celui-ci a partir de la culture. 

12. Anticorps ou fragments de ceux-ci reagissant avec le GDF-8 ou des fragments fonctionnels de celui-ci suivant la 
revendication 10. 

35 13. Anticorps suivant la revendication 12, dans lesquels les anticorps sont polyclonaux ou monoclonaux. 

14. Composition de diagnostic comprenant I'anticorps ou un fragment de celui-ci suivant la revendication 12 ou 13. 

15. Procede de detection d'un trouble de proliferation cellulaire, in vitro, comprenant la mise en contact de I'anticorps 
40 ou d'un fragment de celui-ci suivant la revendication 12 ou 13 avec un echantillon d'un sujet dont on pense qu'it 

presente un trouble associe au GDF-8 et la detection d'une liaison de I'anticorps ou du fragment de celui-ci. 

16. Procede suivant la revendication 15, dans lequel I'echantillon comprend une cellule musculaire. 

45 1 7. Procede suivant la revendication 1 5 ou 1 6, dans lequel I'anticorps ou le fragment de celui-ci est marque de maniere 
detectable. 

18. Procede suivant la revendication 17, dans lequel le marqueur est un radioisotope, un compose fluorescent, un 
compose bioluminescent, un compose chimioluminescent ou une enzyme. 


19. Sequence anti-sens qui est complementaire a et peut realiser une hybridation dans des conditions stringentes 
avec au moins 1 5 nucleotides de la sequence polynucleotidique suivant I'une quelconque des revendications 1 a 4. 

20. Ribozyme pouvant reconnaitre et couper la sequence polynucleotidique suivant I'une quelconque des revendica- 
55 tions 1 a 4. 

21. Composition therapeutique comprenant un anticorps ou un fragment de celui-ci suivant la revendication 1 2 ou 1 3, 
une sequence anti-sens suivant la revendication 19 ou un ribozyme suivant la revendication 20. 
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22. Utilisation d'un anticorps ou d'un fragment de celui-ci suivant la revendication 12 ou 13, d'une sequence anti-sens 
suivant la revendication 19 ou d'un ribozyme suivant la revendication 20 comme reactif qui reprime I'activite du 
GDF-8 pour la preparation d'une composition pour le traitement d'un trouble de proliferation cellulaire associe a 
{'expression du GDF-8. 

23. Utilisation suivant la revendication 22, dans laquelle ladite cellule est une cellule musculaire. 

24. Utilisation suivant la revendication 22 ou 23, dans laquelle le reactif qui reprime I'activite du GDF-8 est introduit 
dans une cellule en utiiisant un vecteur. 

25. Utilisation suivant la revendication 24, dans laquelle le vecteur est un systeme de dispersion colloidal. 

26. Utilisation suivant la revendication 25, dans laquelle le systeme de dispersion colloidale est un liposome. 

27. Utilisation suivant la revendication 26, dans laquelle le liposome est essentiellement specifique pour une cible. 

28. Utilisation suivant la revendication 26 ou 27, dans laquelle le liposome est cible sur le plan anatomique. 

29. Utilisation suivant I'une quelconque des revendications 26 a 28, dans laquelle le liposome est cible sur le plan 
mecanistique. 

30. Utilisation suivant la revendication 29, dans laquelle le ciblage mecanistique est passif ou actif. 

31 . Utilisation suivant la revendication 30, dans laquelle le liposome est cible de maniere active par un couplage avec 
une partie selectionnee parmi le groupe comprenant un sucre, un glycolipide et une proteine. 

32. Utilisation suivant la revendication 24, dans laquelle le vecteur est un virus. 

33. Utilisation suivant la revendication 32, dans laquelle le virus est un virus a ARN. 

34. Utilisation suivant la revendication 33, dans laquelle le virus a ARN est un retrovirus. 

35. Utilisation suivant la revendication 34, dans laquelle le retrovirus est essentiellement specifique pour une cible. 

36. Utilisation suivant la revendication 35, dans laquelle une partie generant une specificity pour une cible est codee 
par un polynucleotide insere dans le genome retroviral. 

37. Utilisation suivant la revendication 36, dans laquelle une partie generant une specificite pour une cible est selec- 
tionnee parmi le groupe comprenant un sucre, un glycolipide et une proteine. 

38. Utilisation suivant la revendication 31 ou 37, dans laquelle la proteine est un anticorps. 
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I T TAAGGTACGAAGGAT T TCAGGCTCTA T TT ACATAAT TGTTCT TTCCT T TTCACACAGAA 60 

N 

61 TCCCT T T TTAGAAGTCAAGG TGACAGACACACCCAAGAGG TCCOGGAGAGACTT TCGGCT 120 

PFLEVKVTOT P (O S QQ] D F G L 
121 TGACTGCGATGAGCACTCCACGGAATCCCGGTGCTGCCGCTACCCCCTCACGGTCGATTT 180 

0 C 0 E H S T E S R C C R Y P L T V D F 
181 TGAAGCCTTTGGATGGGACTGGATTATCGCACCCAAAAGATATAAGGCCAATTACTGCTC 240 

E A F G W D MM IAPKRYKANYCS 
241 AGGAGAGTGTGAATTTGTGT T T T TACAAAAATATCCGCATACTCATCT TGTGCACCAAGC 300 

GECEFVFL0KYPHTHLVH0A 
301 AAACCCCAGAGGCTCAGCAGGCCCTTGCTGCACTCCGACAAAAATGTCTCCCATTAATAT 360 

NPRGSAGPCCTPTKMSP 1NM 
361 GCTATATTTTAATGGCAAAGAACAAATAATATATGGGAAAATTCCAGCCATGGTAGTAGA 420 

LYFNGKEQI IYGKIPAMVVD 
421 CCGCTGTGGGTGCTCATG AGCT T TGC ATTAGGT TAGAAACT TCCCAAG TCATGGAAGGTC 480 

R C G C S » 

481 TTCCCCTCAATTTCGAAACTGTGAATTCCIGCAGCCCGGGGGATCCACTAGTTCTAGAGC 540 
541 GGCCGCCACC 550 

FIG.2a 



1 CAAAAAGATCCAGAAGGGATTTTGGTCTTGACTGTGATGAGCACTCAACAGAATCACGAT 60 

a S Q 0FGLDC0EHSTESRC 
61 GCTGTCGTTACCCTCTAACTGTGGATTTTGAAGCTTTTGGATGGGATTGGATTATOGCTC 120 

CRYPLTVOFEAFGWDWI I A P 
121 CTAAAAGATATAAGGCCAATTACTGCTCTGGAGAGTGTGAATTTGTATTTTTACAAAAAT 180 

KRYKANYCSGECEFVFLOKY 
181 ATCCTCATACTCATCTGGTACACCAAGCAAACCCCAGAGGTTCAGCAGGCCCTTGCTGTA 240 

PHTHLVHOANPRGSAGPCCT 
241 CTCCCACAAAGATGTCTCCAATTAATATGCTATATTTTAATGGCAAAGAACAAATAATAT 300 

PTKMSPINMLYFNGKEQIIY 
301 ATGGGAAAATTCCAGCGATGGTAGTA 326 

C K I P A M V V 

FIG.2b 
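
The figures pair each nucleotide line with its predicted amino acid translation. A minimal sketch of that translation step, using the standard genetic code, is given below. The names `CODON_TABLE` and `translate` are illustrative assumptions, and the example stretch is copied from the end of the sequence shown in FIG.2b on the assumption that the OCR text is correct there.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate DNA in the given reading frame (0, 1 or 2).

    Spaces are ignored; '*' marks a stop codon and 'X' marks any codon
    containing a character outside A, C, G, T (for example an OCR error).
    """
    dna = dna.upper().replace(" ", "")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Stretch assumed to be read correctly from FIG.2b.
print(translate("GGGAAAATTCCAGCGATGGTAGTA"))
```

Mapping unrecognised codons to `X` rather than raising an error keeps the function usable on OCR-damaged figure text, at the cost of silently tolerating bad input.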
FIG.3 (amino acid sequence alignment figure; row labels: GDF-8, GDF-1, BMP-2, BMP-4, Vgr-1, OP-1, BMP-5, BMP-3, MIS, inhibin α, inhibin βA, inhibin βB, TGF-β1, TGF-β2, TGF-β3) 
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1 GTCTCTCGGACGGTACATGCACTAATATTTCACTTGGCATTACTCAAAAGCAAAAAGAAG 60 
61 AAATMGMCMG^AAAAAAAAAGATTGTGCTGATTTTTAAMTGATGCAAAM 120 

M M Q K L 0 

121 AATGTATGTTTATATTTACCTGTTCATGCTGATTGCTGCTGGCCCAGTGGATCTAAATGA 180 

MYVY1YLFML IAAGPVDINE 
181 GGGCAGTGAGAGAGAAGAAMTGTGGAAAAAGAG(X5GCTGTGTAATGCATGTGCGTGGAG 240 

GSEREENVEKEGLCNACAWR 
241 ACAAAACACGAGGTACTCCAGAATAGAAGCCATAAAAATICAAATCCTCAGTAAGCTGCG 300 

QNTRYSR1EAIKIQIISKLR 
301 CCTGGAAACAGCTCC TMCATCAG CAAAGATGCTATAAGACAACTTCTGCCAACAGCGCC 360 

L E T A P [N T.-.SJ KOAIROLLPRAP 
361 TCCACTCCGGGAACTGATOGATCAGTACGACGTCCAGAGGGATGACAGCAGTGATGGCTC 420 

PLRELJ'OOYOVOROOSSOGS 
421 TITGGAAGATGACGATTATCACGCTACCACGGAAACAATCATTACCATGCCTACAGAGTC 480 

LEODDYHATTETI 1TMPTES 
481 TGACTTTCTAATGCAAGCGGATGGCAAGCCCAAATGTTGCTTTTTTAAATTTAGCTCTAA 540 

OFLMOAOGKPKCCFFKFSSK 
541 MTACAGTACMCAAAGTAGTAAAAGCCCMCTGTCGATATATCTCAGACCCGTCAAGAC 600 

IQYNKVVKAQLW1YIRPVKT 
601 TCCIACMCAGTGTTTGTGCAMTCCTGAGACTCATCAAACCCATGAAAGACGGTACAAG 660 

PTTVFVOILRL IKPMKOGTR 
661 GTATACTGGAATCCGATCTCTGAAACTTGACATGAGCCCAGGCACTGGTATTTGGCAGAG 720 

YTGIRSLKLDMSPGTGIWQS 
721 TATTGATGTGAAGACAGTGTTGCAAAATTGGCTCAAACAGCCTGAATCCAACTTAGGCAT 780 

1DVKTVLONWLK0PESNLG I 
781 TGAAATCAAAGCTTTGGATGAGAATGGCCATGATCTTGCTGTAACCTTCCCAGGACCAGG 840 

E1KALDENGHDLAVTFPGPG 
841 AGAAGATGGGCTGMTCCCTTTTTAGAAGTCAAGGTGACAGACACACCCAAGAGGTCCCG 900 



EDGLNPFLEVKVTDTP K |P $ R 
901 GAGAGACT7TGCGCTTGACTGCGATGAGCACTCCACGCAATCCCGGTGCTGCCGCTACCC 960 

T)D FGLDCOEHSTESRCCRYP 
961 CCTCACGG7CGATTTTGAAGCCTTTGGATGGGACTGGATTATCGCACCCAAAAGATATAA 1020 

LTVDFEAFGWDW! IAPKRYK 
1021 GGCCAATTACTGCTCAGGAGAGTGTGAATTTGTG7TTTTACAAAAATATCCGCATACTCA 1080 

ANYCSGECEFVFLOKYPHTH 
1081 TCTTGTGCACCAAGCAAACCCCAGAGGCTCAGCAGGCCCTTGCTGCACTCCGACAAAAAT 1140 

LVHOANPRGSAGPCCTPTKM 
1141 GTCTCCCATTAATATGCTATATTTTAATGGCAAAGAACAAATAATATATGGGAAAATTCC 1200 

SPINMLYFNGKEQI 1YGKIP 
1201 AGCX>TGGT AGT AGACOGCTG TGGGTGCTCATG AGCTT TGCATT AGGT T AGAAACT TCCC 1260 

AMVVDRCGCS* 



FIG.5a 
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1261 AAGTCATGGAAGC TCT TCCCCTCAAT TTCGAAACTGTGAAT TCAAGCACCACAGGCTGT A 1320 

1321 GGCCTTGAGTATGCTCTAGTAACGTAAGCACAAGCTACAGTGTATGAACTAAAAGAGAGA 1380 

1381 AT AG ATGCAATGGTTGGCAT TCAACCA(XAAAAT AAACCATACT ATAGGATGT TG TATGA 1440 

1441 T TTCCAG AGT T T T TGAAAT AGATGGAGATCAAAT T ACAT T T ATGTCCAT AT ATGT AT AT T 1500 

1501 ACAACTACAATCT AGGCAAGGAAGTGAGAGCACATCTTG TGGTCTGCTGAG T TAGGAGGG 1560 

1561 TATGATTAAAAGGTAAAGTCTTATTTCCTAACAGTTTCACTTAATATTTACAGAAGAATC 1620 

1621 TATATGTAGCCTTTGTAAAGTGTAGGATTGTTATCATTTAAAAACATCATGTACACTTAT 1680 

1681 ATTTGTATTGTATACTTGGTAAGATAAAATTCCACAAAGTAGGAATGGGGCCTCACATAC 1740 

1741 ACATTGCCATTCCTATTATAATTGGACAATCCACCAGGGTGCTAATGCAGTGCTGAATGG 1800 

1801 CTCCTACTGGACCTCTCGATAGAACACTCTACAAAGTACGAGTCTCTCTCTCCCTTCCAG 1860 

1861 GTGCATCTCCACACACACAGCACTAAGTGTTCAATGCATTTTCTTTAAGGAAAGAAGAAT 1920 

1921 CTTTTTTTCTAGAGGTCAACTTTCAGTCAACTCTAGCACAGCGGGAGTGACTGCTGCATC 1980 

1981 TTAAAAGGCAGCCAAACAGTATTCATTTTTTAATCTAAATTTCAAAATCACTGTCTGCCT 2040 

2041 TTATCACATGGCAATTTTGTGGTAAAATAATGGAAATGACTGGTTCTATCAATATTGTAT 2100 

2101 AAAAGACTCTGAAACAATTACATTTATATAATATGTATACAATATTGTTTTGTAAATAAG 2160 

2161 TGTCTCCTTTTATATTTACTTTGGTATATTTTTACACTAATGAAATTTCAAATCATTAAA 2220 

2221 GTACAAAGACATGTCATGTATCACAAAAAAGGTGACTGCTTCTATTTCAGAGTGAATTAG 2280 

2281 CAGATTCAATAGTGGTCTTAAAACTCTGTATGTTAAGATTAGAAGGTTATATTACAATCA 2340 

2341 ATTTATGTATTTTTTACATTATCAACTTATGGTTTCATGGTGGCTGTATCTATGAATGTG 2400 

2401 GCTCCCAGTCAAATTTCMTGCCCCACCATTTTAAAAATTACAAGCATTACTAAACATAC 2460 

2461 CAACATGTATCTAAAGAAATACAAATATGGTATCTCAATAACAGCTACTTTTTTATTTTA 2520 

2521 TAATTTGACAATGAATACATTTCTTTTATTTACTTCAGTTTTATAAATTGGAACTTTGTT 2580 

2581 TATCAAATGTATTGTACTCATAGCTAAATGAAATTATTTCTTACATAAAAATGTGTAGAA 2640 

2641 ACTATAAATTAAAGTGTTTTCACATTTTTGAAAGGC 2676 

FIG.5b 
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1 AAGAAAAGTAAAAGGAAGAAACAAGAACAAGAAAAAAGATTATATTGATTTTAAAATCAT 60 

M 

61 GCAAAAACTGCAACTCTGTGT TTATAT T T ACCTGTT TATGCTGAT TG T TGCTGGTCCAG T 120 
QKLQLCVYIYLFML IVAGPV 

121 GGATCTAAATGAGAACAGTGAGCAAAAAGAAAATGTGGAAAAAGAGGGGCTGTGTAATGC 180 
DLNENSEQKENVEKEGLCNA 

181 ATGTACT7GGAGACAAAACACTAAATCTTCAAGAATAGAAGCCATTAAGATACAAATCCT 240 
CTWRQNTKSSR1EA1KIQIL 

241 CAGTAAACTTCGTCTGGAAACAGCTCCTAACATCAGCAAAGATGTTATAAGACAACTTTT 300 



SKLRLETA P IN ..I: SI K 0 V I R Q L L 



301 ACCCAAAGCTCCTCCACTCCGGGAACTGATTGATCAGTATGATGTCCAGAGGGATGACAG 360 

PKAPPLREL IOQYDVORDOS 
361 CAGCGATGGCTCTTTGGAAGATGACGATTATCACGCTACAACGGAAACAATCATTACCAT 420 

SDGSLEOODYHATTETIITM 
421 GCCT AC AG AGTCTGATT T ?CT AATGCAAGTGGATGGAAAACCCAAATG TTGCT TC T TT AA 480 

PTESDFLMOVOGKPKCCFFK 
481 ATTTAGCTCTAAAATACAATACAATAAAGTAGTAAAGGCCCAACTATGGATATATTTGAG 540 

FSSKIQYNKVVKAQIWlYLR 
541 ACCCGTCGAGACTCCTACAACAGTGTTTGTGCAAATCCTGAGACTCATCAAACCTATGAA 600 

PVETPTTVFVOILRLIKPMK 
601 AGACGGTACAAGGTATACTGGAATCCGATCTCTGAAACTTGACATGAACCCAGGCACTGG 660 

DGTRYTGIRSLKLDMNPGTG 
661 TATTTGGCAGAGCATTGATGTGAAGACAGTGTTGCAAAATTGGCTCAAACAACCTGAATC 720 

IWQSIOVKTVLONWLKOPES 
721 CAACTTAGGCATTGAAATAAAAGCTTTAGATGAGAATGGTCATGATCTTGCTGTAACCTT 780 

N L G I E 1 K AL DENGHDLAVTF 
781 CCCAGGACCAGGAGAAGATGGGCTGAATCCGTTTTTAGAGGTCAAGGTAACAGACACACC 840 

PGPGEOGLNPFIEVKVTOTP 
841 AAAAAGATCCAGAAG^ATTTTGGTCTTGACTGTGATGAGCACTCAACAGAATCACGATG 900 



K IR S R RI D FGLDCDEHSTESRC 
901 CTGTCGTTACCCTCTAACTGTGGATTTTGAAGCTTTTGGATGGGATTGGATTATCGCTCC 960 

CRYPLTVOFEAFGWOWI IAP 
961 TAAAAGATATAAGGCCAATTACTGCTCTGGAGAGTGTGAATTTGTATTTTTACAAAAATA 1020 

KRYKANYCSGECEFVFLQKY 
1021 TCCTCATACTCATCTGGTACACCAAGCAAACCCCAGAGGTTCAGCAGGCCCTTGCTGTAC 1080 

PHTHLVHQANPRGSAGPCCT 
1081 TCCCACAAAGATGTCTCCAATTAATATGCTATATTTTAATGGCAAAGAACAAATAATATA 1140 

PTKMSPINMLYFNGKE01 IY 
1141 TGGGAAAATTCCAGCGATGGTAGTAGACCGCTGTGGGTGCTCATGAGATTTATATTAAGC 1 200 

GKIPAMVVORCGCS* 
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1201 GTTCATAACTTCCTAAAACATGGAAGGTTTTCCCCTCAACAATTTTGAAGCTGTGAAATT 1260 

1261 AAGTACCACAGGCTATAGGCCrAGAGTATGCTACAGTCACTTAAGCATAAGCTACAGTAT 1320 

1321 GTAAACTAAMGGGGGAATATATGCAATGGnGGCATTTAACCATCCAAACAAATCATAC 1380 

1381 AACAAAGT T TT ATGAT7 TCCAG AGT T T T TGAGCT AGAAGGAG ATCAAAT TACAT T T ATG T 1440 

1441 TCCTATATATlACAACATCGGCGAGGAAATGAAAGCGATTCTCCTTGAGnCTGATGAAT 1500 

1501 TAAAGGAGTATGCTTTAAAGTCTATTTCTTTAAAGTTTTGTTTAATATTTACAGAAAAAT 1560 

1561 CCACATACAGTATTGGTAAAATGCAGGATTGTTATATACCATCATTGGAATCATCCTTAA 1620 

1621 ACACTTGAAT7TATATTGTATGGTAGTATACTTGGTAAGATAAAATTCCACAAAAATAGG 1680 

1681 GATGGTGCAGCATATGCAATTTCCATTCCTATTATAATTGACACAGTACATTAACAATCC 1740 

1741 ATGCCAACGGTGCTAATACGATAGGCTGAATGTCTGAGGCTACCAGGTnATCACATAAA 1800 

1801 AAACATTCAGlAAAATAGTAAGTTTCTCTTTTCTTCAGGTGCATTTTCCTACACCTCCAA 1860 

1861 ATGAGGAATGGATTTTCTTTAATGTAAGAAGAATCATTTTTCTAGAGGTTGGCTTTCAAT 1920 

1921 TCTG T AGCATACT TGGAGAAACTGCATTATCTTAAAAGGCAGTCAAATGGTGTT TG T TTT 1980 

1981 TATCAAAATGTCAAAATAACATACTTGGAGAAGTATGTAATTTTGTCTTTGGAAAATTAC 2040 

2041 AACACTGCCTTTGCAACACTGCAGTT T TT ATGGT AAAAT AATAGAAATGATCGACTCTAT 2100 

2101 CAATATTGTATAAAAAGACTGAAACAATGCATTTATATAATATGTATACAATATTGTTTT 2160 

2161 GTAAATAAGTGTCTCCTTTTTTATTTACTTTGGTATATTTTTACACIAAGGACATTTCAA 2220 

2221 ATTAAGTACTAAGGCACAAAGACATGTCATGCATCACAGAAAAGCAACTACTTATATTTC 2280 

2281 AGAGCAAATTAGCAGATTAAATAGTGG1CTTAAAACTCCATA7GTTAATGATTAGATGG1 2340 

2341 TATATTACAATCATTTIATATTTTTTTACATGATTAACATTCACTTATGGATTCATGATG 2400 

2401 GCTGTATAAAGTGAATTTGAAATTTCAATGGTTTACTGTCATTGTGTTTAAATCTCAACG 2460 

2461 TTCCATTATTTTAATACTTGCAAAAACATTACTAAGTATACCAAAATAATTGACTCTATT 2520 

2521 ATCTGAAATGAAGAATAAACTGATGCTATCTCAACAATAACTGTTACTTTTATTTTATAA 2580 

2581 TTTGATAATGAATATATTTCTGCATTTATTTACTTCTGTTTTGTAAA7TGGGATTTTGTT 2640 

2641 AATCAAATTTATTGTACTATGACTAAATGAAATTATTTCTTACATCTAATTTGTAGAAAC 2700 

2701 AGTATAAGTTATATTAAAGTGTTTTCACATTTTTTTGAAAGAC 2743 
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t fcWQKLQMYVY I YLFML IAAGPVDLNEGSEREENftKEGLCNACAWRQNTR 50 

mil iiiiiiiii iiiiini ii minium \\\\\ 

1 MQKLQLCVYI YLFML I VAGPVDLNENSEQKENVEKEGLCNACTWRQNTK 49 
51 YSRIEAIKIQ1LSKLRLETAPNISK0AIR0LLPRAPPLRELIDOYDVQRO 100 

. iiiiiiiiiiiiiiiiiiiiiiiii huh i ii 1 1 j j i r 1 1 1 1 1 1 1 

50 SSRIEAIKIQILSKLRLETAPNiSKOVIRQLLPKAPPLRELlOOYOVQRO 99 

101 OSSDGSLEDOOYHATTET 1 1 TMPTESDFLMQADGKPKCCFFKFSSKIQYN 150 
nn lllllllllllllllllllllllllllllll llllllllllllllllll 
100 OSSDGSLEDOOYHATTET 1 1 TMPTESOFLMQVOGKPKCCFFKFSSKIOYN 149 

151 KWKAQLWl YLRPVKTPTTVFVQ 1 LRL I KPMKDGTRYTG I RSLKLDMSPG 200 

IHIIIIIIIIHI HllllllllllliniHIIHIIIIIIIH II 

150 KWKAQLWl YLRPVETPTT VFVOl LRL I KPMKDGTRYTG I RSLKLDMNPG 199 

201 TG 1 WQSI0VKTVL(3NWLK0PESNLG I E 1 KALDENGHDLAVTFPGPGEDGL 250 

_ llllllllllllllllllllllllllllllllinillllllllllllll 
200 TGJWOSIDVKTVLQNWLKOPESNLGIEIKALDENGHOLAVTFPGPGEDGL 249 

251 NPFLEVKVTDTPKRSRROFGLOCDEHSTESRCCRYPL TVOFEAFGWDWI I 300 

IIIHIIinillHIIHIIIIIIKIIIIIIHIIIIIIHinilll 

250 NPFLEVKVTDTPKRSRROFGLOCOEHSTESRCCRYPLTVDFEAFGWDWIl 299 

301 APKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPiN 350 

llllllllllllllllllllllllllllllllllllllllilllllllll 
300 APKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPIN 349 

351 MLYFNGKEQI IYGKJPAMWDRCGCS 376 

llllllllllllllllllllllllll 
350 MLYFNGKEQI I YGKIPAMWDRCGCS 375 
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